How to download a list of files from Azure Blob Storage given SAS URI and container name using Python?

I have the container name and its folder structure. I need to download all files in a single folder in the container using Python. I also have a SAS URL for this particular folder.

The methods I have found online use the BlockBlobService class, which is part of the old SDK. I need a way to do this with the current SDK.

Can you please help me with this?

Edit 1:

This is my SAS URL: https://xxxx.blob.core.windows.net/<CONTAINER>/<FOLDER>?sp=r&st=2022-05-31T17:49:47Z&se=2022-06-05T21:59:59Z&sv=2020-08-04&sr=c&sig=9M8ql9nYOhEYdmAOKUyetWbCU8hoWS72UFczkShdbeY%3D

Edit 2:

Added a link to the method I found.

Edit 3:

I also have the full path of the files that I want to download.

Answer

Please try this code (untested though).

The code below parses the SAS URL and creates an instance of ContainerClient. It then lists the blobs in that container whose names start with the folder name. Once you have that list, you can download the individual blobs.

I noticed that your SAS URL only has read permission (sp=r). Listing blobs requires list permission as well, so you would need both read and list permissions (sp=rl). You will need to ask for a new SAS URL with these two permissions.

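import os
from urllib.parse import urlparse
from azure.storage.blob import ContainerClient

sasUrl = "https://xxxx.blob.core.windows.net/<CONTAINER>/<FOLDER>?sp=r&st=2022-05-31T17:49:47Z&se=2022-06-05T21:59:59Z&sv=2020-08-04&sr=c&sig=9M8ql9nYOhEYdmAOKUyetWbCU8hoWS72UFczkShdbeY%3D"

# Split the SAS URL into the account endpoint, the container/folder path and the SAS token.
sasUrlParts = urlparse(sasUrl)
accountEndpoint = sasUrlParts.scheme + '://' + sasUrlParts.netloc
sasToken = sasUrlParts.query
pathParts = sasUrlParts.path.split('/')
containerName = pathParts[1]
# Join the remaining segments so that nested folders also work.
folderName = '/'.join(pathParts[2:])

containerClient = ContainerClient(accountEndpoint, containerName, credential=sasToken)

# List only the blobs whose names start with the folder name (needs list permission).
blobs = containerClient.list_blobs(name_starts_with=folderName)

for blob in blobs:
    blobClient = containerClient.get_blob_client(blob.name)
    # Save each blob under its base file name in the current directory.
    with open(os.path.basename(blob.name), "wb") as f:
        f.write(blobClient.download_blob().readall())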
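
Before asking for a new SAS URL, it may help to check which permissions the current one carries by reading its sp query parameter. The snippet below is a minimal sketch using only the standard library; the helper name sas_permissions and the example URL (with a placeholder signature) are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def sas_permissions(sas_url):
    """Return the set of permission letters found in the SAS URL's `sp` parameter."""
    query = parse_qs(urlparse(sas_url).query)
    # parse_qs maps each key to a list of values; take the first `sp` value if present.
    return set(query.get("sp", [""])[0])


# Hypothetical URL: the account, container, folder and signature are placeholders.
url = "https://xxxx.blob.core.windows.net/container/folder?sp=r&sr=c&sig=placeholder"
perms = sas_permissions(url)
can_list = "l" in perms  # listing blobs needs the `l` (list) permission
print(sorted(perms), "can list blobs:", can_list)
```

If can_list is False, the listing call is expected to be rejected by the service, while downloading a blob whose name you already know should still work with read permission alone.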

UPDATE

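I have the SAS URL mentioned above, and the full paths of the files I want to download. For example: PATH 1: Container/folder/file1.csv, PATH 2: Container/folder/file2.txt, and so on.

Since you already know the full blob paths, nothing has to be listed, so read permission is enough. Please see the code below, which builds a SAS URL for an individual blob and creates a BlobClient from it: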

from urllib.parse import urlparse
from azure.storage.blob import BlobClient

sasUrl = "https://xxxx.blob.core.windows.net/<CONTAINER>/<FOLDER>?sp=r&st=2022-05-31T17:49:47Z&se=2022-06-05T21:59:59Z&sv=2020-08-04&sr=c&sig=9M8ql9nYOhEYdmAOKUyetWbCU8hoWS72UFczkShdbeY%3D"

blobNameWithContainer = "Container/folder/file1.csv"

# Reuse the account endpoint and the SAS token from the folder URL.
sasUrlParts = urlparse(sasUrl)
accountEndpoint = sasUrlParts.scheme + '://' + sasUrlParts.netloc
sasToken = sasUrlParts.query

# Build a SAS URL that points at the individual blob.
blobSasUrl = accountEndpoint + '/' + blobNameWithContainer + '?' + sasToken

blobClient = BlobClient.from_blob_url(blobSasUrl)
# ... now do any operation on that blob, e.g. blobClient.download_blob().readall() ...
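
With several full paths, the same URL construction can be repeated in a loop. The following is a rough sketch of just the URL-building step, using only the standard library. The helper name blob_sas_urls is hypothetical, the signature is a placeholder, and percent-encoding the path with quote is an added precaution for names containing spaces, not something the original code does.

```python
from urllib.parse import urlparse, quote


def blob_sas_urls(folder_sas_url, blob_paths):
    """Build one SAS URL per 'container/folder/file' path, reusing the token of folder_sas_url."""
    parts = urlparse(folder_sas_url)
    endpoint = parts.scheme + "://" + parts.netloc
    # quote() keeps '/' as-is and percent-encodes characters such as spaces.
    return [endpoint + "/" + quote(path.lstrip("/")) + "?" + parts.query
            for path in blob_paths]


# Placeholder SAS URL and the example paths from the question.
folder_url = "https://xxxx.blob.core.windows.net/Container/folder?sp=r&sr=c&sig=placeholder"
paths = ["Container/folder/file1.csv", "Container/folder/file2.txt"]
urls = blob_sas_urls(folder_url, paths)
for u in urls:
    print(u)
```

Each resulting URL can then be passed to BlobClient.from_blob_url, and the blob can be downloaded with download_blob() as in the code above.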
User contributions licensed under: CC BY-SA