How to read a tsv file with vaex and output a pyarrow parquet file?

With these vaex and pyarrow versions:

JavaScript

When reading a tsv file and exporting it to arrow, the arrow table couldn’t be properly loaded by pyarrow.read_table(). For example, given a file s2t.tsv:

JavaScript

The file looks like this:

JavaScript

And when I tried exporting the tsv to arrow as such, then reading it back:

JavaScript

It throws the following error:

JavaScript

Are there additional args/kwargs that should be added when exporting or reading the parquet files?

Or is the exporting to arrow bugged/broken somehow?
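
One way to investigate this is to check which format the exported file actually is on disk. Parquet files begin with the 4-byte magic `PAR1`, while files in the Arrow IPC file format begin with `ARROW1`, so the two are not interchangeable even though both hold Arrow-compatible data. The helper below is a minimal sketch using only the standard library; the name `sniff_format` is hypothetical and not part of vaex or pyarrow.

```python
PARQUET_MAGIC = b"PAR1"       # leading magic bytes of a Parquet file
ARROW_FILE_MAGIC = b"ARROW1"  # leading magic bytes of an Arrow IPC file


def sniff_format(path):
    """Guess a file's format from its first bytes.

    Returns 'parquet', 'arrow-ipc-file', or 'unknown'.
    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        head = f.read(len(ARROW_FILE_MAGIC))
    if head.startswith(PARQUET_MAGIC):
        return "parquet"
    if head.startswith(ARROW_FILE_MAGIC):
        return "arrow-ipc-file"
    # The Arrow IPC *stream* format has no leading magic, so it lands here too.
    return "unknown"
```

If this reports `arrow-ipc-file` for a file that was meant to be Parquet, the export most likely wrote a different format than the reader expects, rather than a corrupted file.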

Answer

According to https://github.com/vaexio/vaex/issues/2228

JavaScript

will export to the right format that can be read by

JavaScript
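
The original snippets are not reproduced above, so the following is only a sketch of what such a round trip might look like, not necessarily the exact code from the linked issue. It assumes vaex's `DataFrame.export_parquet` for writing and `pyarrow.parquet.read_table` for reading; the function name `tsv_to_parquet` and the output path are hypothetical, and the TSV is assumed to have a header row (pass `header=None` through `from_csv` otherwise).

```python
def tsv_to_parquet(tsv_path, parquet_path):
    """Read a TSV with vaex, write it as Parquet, and load it back with pyarrow.

    Hypothetical helper for illustration; returns a pyarrow.Table.
    """
    # Imported here so the module still loads where these packages are missing.
    import pyarrow.parquet as pq
    import vaex

    # vaex.from_csv forwards extra keyword arguments to pandas.read_csv,
    # so sep="\t" selects tab-separated parsing.
    df = vaex.from_csv(tsv_path, sep="\t")

    # export_parquet writes a Parquet file; export_arrow would instead
    # write the Arrow IPC format, which a Parquet reader cannot open.
    df.export_parquet(parquet_path)

    return pq.read_table(parquet_path)
```

A call such as `table = tsv_to_parquet("s2t.tsv", "s2t.parquet")` would then return a pyarrow table built from the exported file.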
User contributions licensed under: CC BY-SA