
What is the equivalent of connecting to Google Cloud Storage (GCS), like connecting to AWS S3 using s3fs?

I want to access google cloud storage as in the code below.

# Amazon S3 connection
import s3fs
from PIL import Image

fs = s3fs.S3FileSystem()

with fs.open("s3://mybucket/image1.jpg") as f:
    image = Image.open(f).convert("RGB")


# Is there equivalent code like this on the GCP side?
with cloudstorage.open("gs://my_bucket/image1.jpg") as f:
    image = Image.open(f).convert("RGB")


Answer

You’re looking for gcsfs. Both s3fs and gcsfs are part of the fsspec project and have very similar APIs.

import gcsfs
from PIL import Image

fs = gcsfs.GCSFileSystem()

with fs.open("gs://my_bucket/image1.jpg") as f:
    image = Image.open(f).convert("RGB")

Note that both can also be accessed through the generic fsspec interface, as long as the underlying driver (s3fs or gcsfs) is installed, e.g.:

import fsspec
from PIL import Image

with fsspec.open("s3://my_s3_bucket/image1.jpg") as f:
    image1 = Image.open(f).convert("RGB")

with fsspec.open("gs://my_gs_bucket/image1.jpg") as f:
    image2 = Image.open(f).convert("RGB")

# fsspec handles local paths too!
with fsspec.open("/Users/myname/Downloads/image1.jpg") as f:
    image3 = Image.open(f).convert("RGB")
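Because fsspec dispatches on the URL scheme, the three cases above can share a single helper. A sketch (read_bytes is just an illustrative name; the s3:// and gs:// cases additionally need s3fs/gcsfs installed and credentials configured):

```python
import fsspec


def read_bytes(url: str) -> bytes:
    """Read a whole object, whatever backend the URL points at."""
    # fsspec picks the filesystem from the scheme: "s3://" -> s3fs,
    # "gs://" -> gcsfs, no scheme -> the local filesystem.
    with fsspec.open(url, "rb") as f:
        return f.read()
```

The same helper then works unchanged for read_bytes("s3://bucket/key"), read_bytes("gs://bucket/key"), or a plain local path.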

fsspec is the filesystem layer underlying pandas and other libraries that parse cloud URLs. The following "just works" (given the relevant drivers are installed) because fsspec provides the URI handling:

pd.read_csv("s3://path/to/my/aws.csv")
pd.read_csv("gs://path/to/my/google.csv")
pd.read_csv("my/local.csv")
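Since pandas 1.2 you can also pass backend options through pandas itself via the storage_options argument, which is forwarded to the fsspec filesystem chosen by the URL's scheme. A sketch; the gs:// bucket in the comment is a placeholder, and the runnable part uses a file:// URL so it works locally:

```python
import os
import tempfile

import pandas as pd

# storage_options is handed straight to the fsspec backend; for a
# "gs://" URL that is gcsfs, so e.g. (placeholder bucket):
#   pd.read_csv("gs://my_public_bucket/data.csv",
#               storage_options={"token": "anon"})
# would read a public object anonymously. "file://" URLs go through
# fsspec's local backend, which is easy to try out:
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w") as f:
    f.write("a,b\n1,2\n3,4\n")

df = pd.read_csv("file://" + path, storage_options={})
print(df.shape)  # (2, 2)
```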