How to keep dtypes when reading a parquet file (read_parquet()) in pandas?

Code:

As you can see here, [{'b': 1}] becomes [{'b': 1.0}].

How can I keep the dtypes when reading the parquet file?

Answer

You can try pyarrow.parquet.read_table followed by pyarrow.Table.to_pandas with integer_object_nulls=True (see the doc).

Output:

               a
0     [{'b': 1}]
1  [{'b': None}]

On the other hand, it looks like pandas.read_parquet with use_nullable_dtypes doesn't help here: the integer still comes back as a float.

Output:

               a
0   [{'b': 1.0}]
1  [{'b': None}]

User contributions licensed under: CC BY-SA