Transforming a pandas df to a parquet-file-bytes-object

I have a pandas dataframe and want to write it as a parquet file to the Azure file storage.

So far I have not been able to transform the dataframe directly into a bytes object that I can then upload to Azure. My current workaround is to save it as a parquet file on the local drive, then read that file back as a bytes object and upload it to Azure.

Can anyone tell me how I can transform a pandas dataframe directly to a “parquet file”-bytes object without writing it to a disk? The I/O operation is really slowing things down and it feels a lot like really ugly code…

What I’m looking for is something like a transform_functionality that takes the dataframe and returns a bytes object, which I can then upload.

Answer

I have found a solution and will post it here in case anyone needs to do the same task. After writing the dataframe to an in-memory buffer with to_parquet, I get the bytes object out of the buffer with its .getvalue() method.

User contributions licensed under: CC BY-SA