I’m trying to write a pandas dataframe as a pickle file into an s3 bucket in AWS. I know that I can write dataframe new_df as a csv to an s3 bucket as follows:
bucket='mybucket'
key='path'
csv_buffer = StringIO()
s3_resource = boto3.resource('s3')
new_df.to_csv(csv_buffer, index=False)
s3_resource.Object(bucket,path).put(Body=csv_buffer.getvalue())
I’ve tried using the same code as above with to_pickle() but with no success.
Advertisement
Answer
I’ve found the solution, need to call BytesIO into the buffer for pickle files instead of StringIO (which are for CSV files).
import io
import boto3
pickle_buffer = io.BytesIO()
s3_resource = boto3.resource('s3')
new_df.to_pickle(pickle_buffer)
s3_resource.Object(bucket, key).put(Body=pickle_buffer.getvalue())