
Include only .gz extension files from S3 bucket

I want to process/download only the .gz files from an S3 bucket. There are more than 10,000 files in the bucket, so I am using:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
objects = bucket.objects.all()

# Print the key of every object in the bucket
# (named obj to avoid shadowing the built-in object)
for obj in objects:
    print(obj.key)

This also lists .txt files, which I want to skip. How can I do that?


Answer

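The easiest way to filter objects by name or suffix is to do it in Python, for example by calling .endswith() on each object's key to include or exclude it.

S3 can filter the listing by prefix (bucket.objects.filter(Prefix=...)), but not by suffix, so a suffix check has to run on the client after the keys are listed.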
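As a minimal sketch of the suffix check described above: the helper names iter_gz_keys and list_gz_keys and their parameters are made up for illustration, and 'my-bucket' is the placeholder bucket name from the question. The filter itself only needs objects with a .key attribute, so it works on whatever bucket.objects.all() or bucket.objects.filter(...) returns.

```python
def iter_gz_keys(objects, suffix=".gz"):
    """Yield the key of every object whose key ends with the given suffix."""
    for obj in objects:
        if obj.key.endswith(suffix):
            yield obj.key


def list_gz_keys(bucket_name, prefix=""):
    """Hypothetical helper: list the .gz keys in a bucket, optionally under a prefix."""
    import boto3  # imported here so iter_gz_keys can be used without boto3 installed

    bucket = boto3.resource("s3").Bucket(bucket_name)
    # The prefix narrows the listing on the S3 side; the suffix is checked in Python.
    if prefix:
        objects = bucket.objects.filter(Prefix=prefix)
    else:
        objects = bucket.objects.all()
    return list(iter_gz_keys(objects))


if __name__ == "__main__":
    # Requires AWS credentials and an existing bucket.
    for key in list_gz_keys("my-bucket"):
        print(key)
```

If the .gz files share a common key prefix, passing it as prefix should reduce how many keys are listed before the suffix check runs.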
