I am trying to learn how to combine all the files from a specific bucket into one CSV file. The files are logs, always in the same format, and all kept in the same bucket. I have this code to access and read them:
bucket = s3_resource.Bucket(bucket_name)
for obj in bucket.objects.all():
    x = obj.get()['Body'].read().decode('utf-8')
    print(x)
It does print them, with separation between the individual files, along with the column headers.
My question is: how can I modify my loop to get them into just one CSV file?
Answer
You should create a file in /tmp/ and write the contents of each object into that file.
Then, when all files have been read, upload the file (or do whatever you want to do with it).
output = open('/tmp/outfile.txt', 'w')

bucket = s3_resource.Bucket(bucket_name)
for obj in bucket.objects.all():
    # Append the decoded contents of each object to the output file
    output.write(obj.get()['Body'].read().decode('utf-8'))

output.close()
Please note that if this code runs in AWS Lambda, the /tmp/ directory is limited to 512 MB by default.
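Since the question mentions that every file carries its own column headers, plain concatenation would repeat the header row once per file. One possible way around that is sketched below. It assumes every object is a CSV whose first row is the same header; `merge_csv_texts` is a hypothetical helper name, not part of boto3, and it works on already-decoded strings so it can be tried without any AWS access.

```python
import csv
import io


def merge_csv_texts(texts):
    """Join CSV documents into one string, keeping only the first header row.

    Assumption: the first non-empty row of every document is the same header.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header_written = False
    for text in texts:
        # Parse the document and drop completely blank lines
        rows = [row for row in csv.reader(io.StringIO(text, newline="")) if row]
        if not rows:
            continue  # skip empty objects
        if not header_written:
            writer.writerow(rows[0])  # write the header once
            header_written = True
        writer.writerows(rows[1:])  # data rows only
    return out.getvalue()


# Hypothetical usage with the loop above (s3_resource and bucket_name
# are assumed to be defined as in the question):
#
# bucket = s3_resource.Bucket(bucket_name)
# texts = (obj.get()['Body'].read().decode('utf-8')
#          for obj in bucket.objects.all())
# with open('/tmp/outfile.csv', 'w', newline='') as f:
#     f.write(merge_csv_texts(texts))
```

Note that this sketch builds the merged result in memory before writing it, so for very large buckets it may be better to write each object's rows to the output file as they are read.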