Writing a pickle file to an s3 bucket in AWS

I’m trying to write a pandas DataFrame as a pickle file into an S3 bucket in AWS. I know that I can write the DataFrame new_df as a CSV to an S3 bucket as follows:

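from io import StringIO

import boto3

bucket = 'mybucket'
key = 'path'

# Serialize the DataFrame to CSV text in an in-memory buffer
csv_buffer = StringIO()
s3_resource = boto3.resource('s3')

new_df.to_csv(csv_buffer, index=False)
# Upload the buffer contents as the object body
s3_resource.Object(bucket, key).put(Body=csv_buffer.getvalue())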

I’ve tried using the same code as above with to_pickle() but with no success.

Answer

I’ve found the solution: pickle output is binary, so the buffer needs to be a BytesIO rather than the StringIO used for CSV text.

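import io
import boto3

# Same bucket and key as in the question
bucket = 'mybucket'
key = 'path'

# Pickle data is binary, so use a bytes buffer
pickle_buffer = io.BytesIO()
s3_resource = boto3.resource('s3')

new_df.to_pickle(pickle_buffer)
s3_resource.Object(bucket, key).put(Body=pickle_buffer.getvalue())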
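
If the installed pandas does not accept a file-like object in to_pickle(), or the object needs to be read back later, something along these lines might work as a rough sketch. The helper names upload_pickle and download_pickle are hypothetical (not part of boto3 or pandas), and s3_client is assumed to be a client created with boto3.client('s3'). It uses the standard pickle module together with the client's put_object and get_object calls.

```python
import pickle


def upload_pickle(s3_client, obj, bucket, key):
    """Pickle obj in memory and upload the bytes to the given bucket/key.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the number of bytes uploaded.
    """
    # pickle.dumps returns bytes, so no intermediate buffer is needed
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    s3_client.put_object(Bucket=bucket, Key=key, Body=payload)
    return len(payload)


def download_pickle(s3_client, bucket, key):
    """Fetch the object at bucket/key and unpickle it.

    Only unpickle data from sources you trust.
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # The "Body" entry is a stream; read() yields the raw bytes
    return pickle.loads(response["Body"].read())
```

With valid AWS credentials this would presumably be called as upload_pickle(boto3.client('s3'), new_df, bucket, key), and download_pickle with the same client, bucket and key should then give back an equivalent DataFrame.
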
User contributions licensed under: CC BY-SA