I want to list all the blobs in a container and then ultimately store each blob's contents (each blob stores a CSV file) in a data frame. It appears that the blob service client is the easiest way to list all the blobs, and this is what I have:
Python

#!/usr/bin/env python3

import os
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
from pathlib import Path
from io import StringIO
import pandas as pd

def main():
    connect_str = os.environ['AZURE_CONNECT_STR']
    container = os.environ['CONTAINER']

    print(connect_str + "\n")
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)
    container_client = blob_service_client.get_container_client(container)
    blob_list = container_client.list_blobs()
    for blob in blob_list:
        print("\t" + blob.name)

if __name__ == "__main__":
    main()
However, in the latest version of the blob storage client there appears to be no method that lets me get the actual contents of a blob. What code should I be using? There are other clients in the Python SDK for Azure, but getting a full list of the blobs in a container with those seems cumbersome.
Answer
What you need to do is create an instance of BlobClient using the container_client and the blob's name. You can then call the download_blob method to download the blob. Something like:
Python

for blob in blob_list:
    print("\t" + blob.name)
    blob_client = container_client.get_blob_client(blob.name)
    blob_client.download_blob()
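Since the end goal is a data frame per CSV blob, here is a minimal sketch of how the pieces fit together, reusing the StringIO and pandas imports from the question. The function name load_blobs_to_dataframes and the dfs dictionary are just illustrative, and it assumes the CSV files are UTF-8 encoded:

Python

import os
from io import StringIO

import pandas as pd
from azure.storage.blob import BlobServiceClient

def load_blobs_to_dataframes(connect_str, container):
    """Download every blob in the container and parse each one as a CSV."""
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)
    container_client = blob_service_client.get_container_client(container)

    dfs = {}
    for blob in container_client.list_blobs():
        blob_client = container_client.get_blob_client(blob.name)
        # download_blob() returns a StorageStreamDownloader; readall() gives the raw bytes
        csv_bytes = blob_client.download_blob().readall()
        # Assumes UTF-8 encoded CSVs; adjust the codec if yours differ
        dfs[blob.name] = pd.read_csv(StringIO(csv_bytes.decode("utf-8")))
    return dfs

if __name__ == "__main__":
    frames = load_blobs_to_dataframes(os.environ['AZURE_CONNECT_STR'], os.environ['CONTAINER'])
    for name, df in frames.items():
        print(name, df.shape)

If the files are large, you can also pass the stream straight to pandas instead of decoding the whole blob into memory, but for typical CSV sizes the readall() approach above keeps the code simple.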