I’ve written a Python script that resamples and renames a ton of audio data and moves it to a new location on disk. I’d like to use this script to move the data I’m resampling to a Google Cloud Storage bucket.
Question: Is there a way to connect/mount your GCP VM instance to a bucket in such a way that reading and writing can be done as if the bucket is just another directory?
For example, this is somewhere in my script:
    import librosa
    import soundfile as sf

    # load audio from old location
    audio, sr = librosa.load(old_path)

    # Do some stuff to the audio
    # ...

    # write audio to new location
    with sf.SoundFile(new_path, 'w', sr, channels=1, format='WAV') as f:
        f.write(audio)
I’d like to have a way to get the path to my bucket, because my script takes an old_path where the original data is, resamples it, and moves it to a new_path.
My script would not be as simple to modify as the snippet above makes it seem, because I’m doing a lot of multiprocessing. Plus I’d like to make the script generic so I can re-use it for local files, etc. Basically, altering the script is off the table.
Answer
You could use the Cloud Storage FUSE adapter (gcsfuse) to mount your GCS bucket onto the local filesystem:
https://cloud.google.com/storage/docs/gcs-fuse
For Linux:
    sudo apt-get update
    sudo apt-get install gcsfuse
    gcsfuse mybucket /my/path
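Once the bucket is mounted (the mount point directory must already exist), paths under /my/path read and write objects in the bucket, so your script can stay untouched. As a minimal sketch of the idea, assuming the bucket is mounted at /my/path, where resample_file is a hypothetical stand-in for your script’s existing entry point and target_sr, src, and dst are example values:

    import os

    import librosa
    import soundfile as sf

    def resample_file(old_path, new_path, target_sr=16000):
        # Hypothetical stand-in for the existing script: reads a local
        # file and writes through the gcsfuse mount; both sides look
        # like ordinary filesystem paths, so the script needs no changes.
        audio, _ = librosa.load(old_path, sr=target_sr)  # resample on load
        with sf.SoundFile(new_path, 'w', target_sr, channels=1, format='WAV') as f:
            f.write(audio)

    src = '/data/audio/clip001.wav'           # original local file (example path)
    dst = '/my/path/resampled/clip001.wav'    # lands in the bucket via the mount
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    resample_file(src, dst)

Because the mount behaves like an ordinary directory, this also works with your multiprocessing setup; just be aware that writes through the mount go over the network, so throughput will be lower than local disk.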
Alternatively, you could use the GCS client library for Python to upload your content directly:
https://cloud.google.com/storage/docs/reference/libraries#client-libraries-usage-python
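This route does mean modifying the script, which you’ve ruled out, but for completeness, here is a minimal upload sketch, assuming the google-cloud-storage package is installed and the VM’s default service account has access to the bucket ('mybucket' and the object/file names are placeholders):

    from google.cloud import storage

    client = storage.Client()                      # uses the VM's default credentials
    bucket = client.bucket('mybucket')             # placeholder bucket name
    blob = bucket.blob('resampled/clip001.wav')    # destination object name
    blob.upload_from_filename('/tmp/clip001.wav')  # local file to upload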