Google Cloud Storage Support (#113) (#114)
* Google Cloud Storage Support (#113)

* Update README and imports for Google Cloud Storage

* Write Google Storage download and upload methods

* Change key to blob for GSClient

* Implement pass methods for GSClient

* Write live tests for GSClient and GSPath

* Unify GSClient methods

* Add Mock Google Storage fixtures for testing

* Fix Google Storage mock bucket copy_blob method

* Change get_bucket method to bucket for GSClient

* Update test mocks for Google Storage

* Add documentation for Google Storage

* Expand authentication options

* Update metadata fetcher for GSPath

* Add Google Cloud setup action to CI workflow

* Tweaks for tests to work on Windows

Co-authored-by: Macklan Weinstein <[email protected]>
pjbull and Macklan Weinstein authored Jan 24, 2021
1 parent ef9d162 commit 9da0f14
Showing 15 changed files with 567 additions and 83 deletions.
8 changes: 8 additions & 0 deletions .github/workflows/tests.yml
@@ -38,12 +38,20 @@ jobs:
run: |
make test
- name: Set up Cloud SDK
uses: google-github-actions/setup-gcloud@master
with:
project_id: ${{ secrets.GCP_PROJECT_ID }}
service_account_key: ${{ secrets.GCP_SA_KEY }}
export_default_credentials: true

- name: Run live tests
run: |
make test-live-cloud
env:
LIVE_AZURE_CONTAINER: ${{ secrets.LIVE_AZURE_CONTAINER }}
AZURE_STORAGE_CONNECTION_STRING: ${{ secrets.AZURE_STORAGE_CONNECTION_STRING }}
LIVE_GS_BUCKET: ${{ secrets.LIVE_GS_BUCKET }}
LIVE_S3_BUCKET: ${{ secrets.LIVE_S3_BUCKET }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
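
The `env:` block above passes one bucket or container name per cloud to the live test run. As a rough sketch of how a test suite could decide which live targets are configured (an assumption for illustration; the helper name `configured_live_targets` is hypothetical and is not taken from this commit), using only the variable names that appear in the workflow:

```python
import os

# Variable names copied from the workflow's `env:` block.
LIVE_TARGET_VARS = ("LIVE_AZURE_CONTAINER", "LIVE_GS_BUCKET", "LIVE_S3_BUCKET")


def configured_live_targets(environ=None):
    """Return {variable name: value} for each live target that is set and non-empty.

    Hypothetical helper: `environ` defaults to the process environment and can
    be replaced with a plain dict in tests.
    """
    env = os.environ if environ is None else environ
    return {name: env[name] for name in LIVE_TARGET_VARS if env.get(name)}
```

A live test for Google Cloud Storage could then be skipped whenever `"LIVE_GS_BUCKET"` is missing from the returned mapping, which may be the case in forks where repository secrets are unavailable.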
2 changes: 1 addition & 1 deletion .gitignore
@@ -107,4 +107,4 @@ ENV/
.mypy_cache/

# IDE settings
-.vscode/
+.vscode/
147 changes: 76 additions & 71 deletions README.md
@@ -18,7 +18,7 @@ with CloudPath("s3://bucket/filename.txt").open("w+") as f:
## Why use cloudpathlib?

- **Familiar**: If you know how to interact with `Path`, you know how to interact with `CloudPath`. All of the cloud-relevant `Path` methods are implemented.
-- **Supported clouds**: AWS S3 and Azure Blob Storage are implemented. Google Cloud Storage and FTP are on the way.
+- **Supported clouds**: AWS S3, Google Cloud Storage, and Azure Blob Storage are implemented. FTP is on the way.
- **Extensible**: The base classes do most of the work generically, so implementing two small classes `MyPath` and `MyClient` is all you need to add support for a new cloud storage service.
- **Read/write support**: Reading just works. Using the `write_text`, `write_bytes` or `.open('w')` methods will all upload your changes to cloud storage without any additional file management as a developer.
- **Seamless caching**: Files are downloaded locally only when necessary. You can also easily pass a persistent cache folder so that across processes and sessions you only re-download what is necessary.
@@ -27,15 +27,15 @@ with CloudPath("s3://bucket/filename.txt").open("w+") as f:

## Installation

-`cloudpathlib` depends on the cloud services' SDKs (e.g., `boto3`, `azure-storage-blob`) to communicate with their respective storage service. If you try to use cloud paths for a cloud service for which you don't have dependencies installed, `cloudpathlib` will error and let you know what you need to install.
+`cloudpathlib` depends on the cloud services' SDKs (e.g., `boto3`, `google-cloud-storage`, `azure-storage-blob`) to communicate with their respective storage service. If you try to use cloud paths for a cloud service for which you don't have dependencies installed, `cloudpathlib` will error and let you know what you need to install.

To install a cloud service's SDK dependency when installing `cloudpathlib`, you need to specify it using pip's ["extras"](https://packaging.python.org/tutorials/installing-packages/#installing-setuptools-extras) specification. For example:

```bash
-pip install cloudpathlib[s3,azure]
+pip install cloudpathlib[s3,gs,azure]
```

-Currently supported cloud storage services are: `azure`, `s3`. You can also use `all` to install all available services' dependencies.
+Currently supported cloud storage services are: `azure`, `gs`, `s3`. You can also use `all` to install all available services' dependencies.

If you do not specify any extras or separately install any cloud SDKs, you will only be able to develop with the base classes for rolling your own cloud path class.
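
To see ahead of time which extras are usable in an environment, the SDK modules can be probed directly. This is a minimal standard-library sketch and not how `cloudpathlib` itself reports missing dependencies. The function name `installed_extras` is hypothetical, and the mapping assumes the usual import names of the SDK packages named above (`boto3`, `google.cloud.storage`, `azure.storage.blob`):

```python
from importlib.util import find_spec

# Extra name -> import name of the SDK module it is assumed to install.
SDK_MODULES = {
    "s3": "boto3",
    "gs": "google.cloud.storage",
    "azure": "azure.storage.blob",
}


def installed_extras(modules=SDK_MODULES):
    """Map each extra name to True if its SDK module can be located."""
    found = {}
    for extra, module in modules.items():
        try:
            found[extra] = find_spec(module) is not None
        except (ImportError, ValueError):
            # For dotted names, a missing parent package (e.g. `google`)
            # raises instead of returning None.
            found[extra] = False
    return found
```

Calling `installed_extras()` returns a dict such as `{"s3": ..., "gs": ..., "azure": ...}`, with each value reflecting what is installed locally.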

@@ -114,73 +114,78 @@ list(root_dir.glob('**/*.txt'))

Most methods and properties from `pathlib.Path` are supported except for the ones that don't make sense in a cloud context. There are a few additional methods or properties that relate to specific cloud services or specifically for cloud paths.

| Methods + properties | `AzureBlobPath` | `S3Path` |
|:-----------------------|:------------------|:-----------|
| `anchor` |||
| `as_uri` |||
| `drive` |||
| `exists` |||
| `glob` |||
| `is_dir` |||
| `is_file` |||
| `iterdir` |||
| `joinpath` |||
| `match` |||
| `mkdir` |||
| `name` |||
| `open` |||
| `parent` |||
| `parents` |||
| `parts` |||
| `read_bytes` |||
| `read_text` |||
| `rename` |||
| `replace` |||
| `rglob` |||
| `rmdir` |||
| `samefile` |||
| `stat` |||
| `stem` |||
| `suffix` |||
| `suffixes` |||
| `touch` |||
| `unlink` |||
| `with_name` |||
| `with_suffix` |||
| `write_bytes` |||
| `write_text` |||
| `absolute` |||
| `as_posix` |||
| `chmod` |||
| `cwd` |||
| `expanduser` |||
| `group` |||
| `home` |||
| `is_absolute` |||
| `is_block_device` |||
| `is_char_device` |||
| `is_fifo` |||
| `is_mount` |||
| `is_reserved` |||
| `is_socket` |||
| `is_symlink` |||
| `lchmod` |||
| `link_to` |||
| `lstat` |||
| `owner` |||
| `relative_to` |||
| `resolve` |||
| `root` |||
| `symlink_to` |||
| `cloud_prefix` |||
| `download_to` |||
| `etag` |||
| `is_valid_cloudpath` |||
| `blob` |||
| `bucket` |||
| `container` |||
| `key` |||
| `md5` |||
| Methods + properties | `AzureBlobPath` | `S3Path` | `GSPath` |
|:-----------------------|:------------------|:-----------|:-----------|
| `anchor` ||||
| `as_uri` ||||
| `drive` ||||
| `exists` ||||
| `glob` ||||
| `is_dir` ||||
| `is_file` ||||
| `iterdir` ||||
| `joinpath` ||||
| `match` ||||
| `mkdir` ||||
| `name` ||||
| `open` ||||
| `parent` ||||
| `parents` ||||
| `parts` ||||
| `read_bytes` ||||
| `read_text` ||||
| `rename` ||||
| `replace` ||||
| `rglob` ||||
| `rmdir` ||||
| `samefile` ||||
| `stat` ||||
| `stem` ||||
| `suffix` ||||
| `suffixes` ||||
| `touch` ||||
| `unlink` ||||
| `with_name` ||||
| `with_suffix` ||||
| `write_bytes` ||||
| `write_text` ||||
| `absolute` ||||
| `as_posix` ||||
| `chmod` ||||
| `cwd` ||||
| `expanduser` ||||
| `group` ||||
| `home` ||||
| `is_absolute` ||||
| `is_block_device` ||||
| `is_char_device` ||||
| `is_fifo` ||||
| `is_mount` ||||
| `is_relative_to` ||||
| `is_reserved` ||||
| `is_socket` ||||
| `is_symlink` ||||
| `lchmod` ||||
| `link_to` ||||
| `lstat` ||||
| `owner` ||||
| `readlink` ||||
| `relative_to` ||||
| `resolve` ||||
| `root` ||||
| `symlink_to` ||||
| `with_stem` ||||
| `cloud_prefix` ||||
| `download_to` ||||
| `etag` ||||
| `fspath` ||||
| `is_valid_cloudpath` ||||
| `rmtree` ||||
| `blob` ||||
| `bucket` ||||
| `container` ||||
| `key` ||||
| `md5` ||||

----
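
The `bucket` and `blob` rows in the table name the two parts of a Google Cloud Storage location (the commit message notes "Change key to blob for GSClient"). As an illustration of that split only, here is a standard-library sketch. It is not `cloudpathlib`'s implementation, and the function name `split_gs_uri` is hypothetical:

```python
from urllib.parse import urlsplit


def split_gs_uri(uri):
    """Split 'gs://bucket/path/to/blob' into a (bucket, blob) tuple.

    Caveat: urlsplit treats '?' and '#' as query and fragment delimiters,
    so blob names containing those characters need different handling.
    """
    parts = urlsplit(uri)
    if parts.scheme != "gs" or not parts.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    # The bucket is the network location; the blob is the path
    # without its leading slash.
    return parts.netloc, parts.path.lstrip("/")
```

For example, `split_gs_uri("gs://my-bucket/dir/file.txt")` yields `("my-bucket", "dir/file.txt")`, and a bare `gs://my-bucket` yields an empty blob name.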

4 changes: 4 additions & 0 deletions cloudpathlib/__init__.py
@@ -4,6 +4,8 @@
from .azure.azblobpath import AzureBlobPath
from .cloudpath import CloudPath
from .s3.s3client import S3Client
from .gs.gspath import GSPath
from .gs.gsclient import GSClient
from .s3.s3path import S3Path


@@ -33,6 +35,8 @@
"CloudPath",
"DirectoryNotEmpty",
"InvalidPrefix",
"GSClient",
"GSPath",
"MissingDependencies",
"OverwriteDirtyFile",
"OverwriteNewerLocal",
7 changes: 7 additions & 0 deletions cloudpathlib/gs/__init__.py
@@ -0,0 +1,7 @@
from .gsclient import GSClient
from .gspath import GSPath

__all__ = [
"GSClient",
"GSPath",
]