I want to list all the blobs in a container and ultimately store each blob's contents (each blob holds a CSV file) in a DataFrame. The BlobServiceClient appears to be the easiest way to list all the blobs, and this is what I have:
#!/usr/bin/env python3
import os
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
from pathlib import Path
from io import StringIO
import pandas as pd

def main():
    connect_str = os.environ['AZURE_CONNECT_STR']
    container = os.environ['CONTAINER']
    print(connect_str + "\n")

    blob_service_client = BlobServiceClient.from_connection_string(connect_str)
    container_client = blob_service_client.get_container_client(container)

    blob_list = container_client.list_blobs()
    for blob in blob_list:
        print("\t" + blob.name)

if __name__ == "__main__":
    main()
However, in the latest version of the blob storage client there appears to be no method that lets me get the actual contents of a blob. What code should I be using? There are other clients in the Python SDK for Azure, but getting a full list of the blobs in a container with those seems cumbersome.
Answer
What you would need to do is create an instance of BlobClient using the container_client and the blob's name. You can then call the download_blob method to download the blob.
Something like:
for blob in blob_list:
    print("\t" + blob.name)
    blob_client = container_client.get_blob_client(blob.name)
    downloaded = blob_client.download_blob()  # returns a StorageStreamDownloader
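Since the end goal is a DataFrame per blob, here is a minimal self-contained sketch of that approach, assuming every blob is a UTF-8 encoded CSV; the encoding and the frames dictionary keyed by blob name are assumptions for illustration, not part of the original question:

import os
from io import StringIO

import pandas as pd
from azure.storage.blob import BlobServiceClient

connect_str = os.environ["AZURE_CONNECT_STR"]
container = os.environ["CONTAINER"]

blob_service_client = BlobServiceClient.from_connection_string(connect_str)
container_client = blob_service_client.get_container_client(container)

# One DataFrame per blob, keyed by blob name (assumption: every blob is a UTF-8 CSV)
frames = {}
for blob in container_client.list_blobs():
    blob_client = container_client.get_blob_client(blob.name)
    # download_blob() returns a StorageStreamDownloader; readall() yields the blob's bytes
    csv_text = blob_client.download_blob().readall().decode("utf-8")
    frames[blob.name] = pd.read_csv(StringIO(csv_text))

Note that ContainerClient also exposes download_blob(blob.name) directly, which lets you skip the intermediate BlobClient when you only need the contents.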