I have an S3 bucket where my application saves some final result DataFrames as .csv files. I would like to download the latest 1000 files in this bucket, but I don’t know how to do it.
I cannot do it manually, as the S3 console doesn’t let me sort the files by date once the bucket has more than 1000 objects.
I’ve seen some questions that could be solved with the AWS CLI, but I don’t have enough user permissions to use it, so I have to do it with a boto3 Python script that I’m going to upload into a Lambda function.
How can I do this?
Answer
If your application uploads files periodically, you could try this:
import boto3
import datetime

last_n_days = 250

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket='bucket', Prefix='processed')

# LastModified values returned by S3 are timezone-aware (UTC), so the
# cutoff must be timezone-aware too, or the comparison raises a TypeError.
date_limit = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=last_n_days)

for page in pages:
    for obj in page.get('Contents', []):
        # Skip "folder" placeholder keys; download everything newer than the cutoff.
        if obj['LastModified'] >= date_limit and obj['Key'][-1] != '/':
            s3.download_file('bucket', obj['Key'], obj['Key'].split('/')[-1])
With the script above, every file modified in the last 250 days will be downloaded. If your application uploads four files per day, that works out to roughly the 1000 most recent files.
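If the upload rate varies and you need exactly the newest 1000 files rather than a date-based cut, you can also list every object under the prefix first, sort by LastModified, and download only the top 1000. A minimal sketch, assuming the same 'bucket' name and 'processed' prefix as above:

import os
import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# Collect every object under the prefix; the paginator transparently
# handles the 1000-keys-per-response limit of list_objects_v2.
objects = []
for page in paginator.paginate(Bucket='bucket', Prefix='processed'):
    for obj in page.get('Contents', []):
        if obj['Key'][-1] != '/':  # skip "folder" placeholder keys
            objects.append(obj)

# Sort newest first and keep exactly the latest 1000.
objects.sort(key=lambda o: o['LastModified'], reverse=True)
for obj in objects[:1000]:
    filename = obj['Key'].split('/')[-1]
    # Inside Lambda, /tmp is the only writable path.
    s3.download_file('bucket', obj['Key'], os.path.join('/tmp', filename))

Listing everything is slower on large buckets, but sorting in memory guarantees you get exactly the latest files regardless of how many are uploaded per day. Note the os.path.join('/tmp', ...): in Lambda, /tmp is the only directory your code can write to.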