AWS Lambda – Combine multiple CSV files from S3 into one file

Question

I am trying to learn how to combine all the files in a specific bucket into one CSV file. The files are log-like, always in the same format, and kept in the same bucket. I have this code to access and read them: It does print them with separation between

Accepted Answer

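You should create a file in /tmp/ and write the contents of each object into that file. Then, when all files have been read, upload the file (or do whatever you want to do with it).

    import boto3

    s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucket_name)  # bucket_name as defined in your own code

    # Append the contents of every object to a single file in Lambda's scratch space.
    # The with-block closes the file once, after all objects have been written.
    with open('/tmp/outfile.txt', 'w') as output:
        for obj in bucket.objects.all():
            output.write(obj.get()['Body'].read().decode('utf-8'))

Please note that the /tmp/ directory is limited to 512 MB by default; the limit can be raised through the function's ephemeral storage setting.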
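
Since the goal is a single CSV, plain concatenation will repeat the header row once per file if each log file starts with one. The following is a minimal sketch of one way to handle that, assuming every file begins with the same header row and is UTF-8 encoded. The function name `combine_csv_objects`, the `.csv` suffix filter, and the output path are illustrative choices, not part of the original answer.

```python
import csv
import io


def combine_csv_objects(bucket, out_path="/tmp/combined.csv"):
    """Merge every .csv object in `bucket` into one file with a single header.

    `bucket` is expected to behave like a boto3 S3 Bucket resource.
    Assumes each file starts with the same header row (an assumption,
    not something the question confirms). Returns the number of data rows.
    """
    rows_written = 0
    header = None
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for obj in bucket.objects.all():
            if not obj.key.endswith(".csv"):
                continue  # skip anything that is not a CSV file
            text = obj.get()["Body"].read().decode("utf-8")
            reader = csv.reader(io.StringIO(text))
            file_header = next(reader, None)
            if file_header is None:
                continue  # empty object, nothing to copy
            if header is None:
                # Write the header only once, taken from the first file
                header = file_header
                writer.writerow(header)
            for row in reader:
                writer.writerow(row)
                rows_written += 1
    return rows_written
```

Inside a Lambda handler this could be called as `combine_csv_objects(boto3.resource('s3').Bucket(bucket_name))`, followed by an upload of `/tmp/combined.csv`. If you upload the result back into the same bucket, a later run will pick it up as input, so a separate bucket or an additional key filter is probably safer.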

Answer