I am trying to read a GeoJSON with Python Polars, like this:
    import polars as pl
    myfile = '{"type":"GeometryCollection","geometries":[{"type":"Linestring","coordinates":[[10,11.2],[10.5,11.9]]},{"type":"Point","coordinates":[10,20]}]}'
    pl.read_json(myfile)
The error I get is:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "...\local-packages\Python39\site-packages\polars\functions.py", line 631, in read_json
        return DataFrame.read_json(source)  # type: ignore
      File "...\local-packages\Python39\site-packages\polars\frame.py", line 346, in read_json
        self._df = PyDataFrame.read_json(file)
    RuntimeError: Other("Error("missing field `columns`", line: 1, column: 143)")
I also tried putting the same content into a file and got a similar error.
As suggested on GitHub, I tried to read the file via pandas, like this:
    import pandas as pd
    initial_df = pl.from_pandas(pd.read_json(file_path))
The error I get is:
      File "...\file_splitter.py", line 13, in split_file
        initial_df = pl.from_pandas(pd.read_json(file_path))
      File "...\local-packages\Python39\site-packages\polars\functions.py", line 566, in from_pandas
        data[name] = _from_pandas_helper(s)
      File "...\local-packages\Python39\site-packages\polars\functions.py", line 534, in _from_pandas_helper
        return pa.array(a)
      File "pyarrow\array.pxi", line 302, in pyarrow.lib.array
      File "pyarrow\array.pxi", line 83, in pyarrow.lib._ndarray_to_array
      File "pyarrow\error.pxi", line 97, in pyarrow.lib.check_status
    pyarrow.lib.ArrowInvalid: cannot mix list and non-list, non-null values
What can I do to read the GeoJSON file?
Answer
If you read the file with pandas, you get columns of dtype object, and Arrow cannot infer a type for one of them (an object column can hold anything; here it holds Python dicts).
If we cast the columns to string, we know that Arrow and Polars can deal with them.
    import pandas as pd
    import polars as pl
    myfile = '{"type":"GeometryCollection","geometries":[{"type":"Linestring","coordinates":[[10,11.2],[10.5,11.9]]},{"type":"Point","coordinates":[10,20]}]}'
    print(pl.from_pandas(pd.read_json(myfile).astype(str)))
    shape: (2, 2)
    ┌────────────────────┬─────────────────────────────────────┐
    │ type               ┆ geometries                          │
    │ ---                ┆ ---                                 │
    │ str                ┆ str                                 │
    ╞════════════════════╪═════════════════════════════════════╡
    │ GeometryCollection ┆ {'type': 'Linestring', 'coordina... │
    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
    │ GeometryCollection ┆ {'type': 'Point', 'coordinates':... │
    └────────────────────┴─────────────────────────────────────┘
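Note that casting with astype(str) stores each geometry as the Python repr of a dict (single quotes), which is not valid JSON. If the coordinates need to be parsed again later, one alternative is to flatten the collection yourself with the standard json module and serialise only the coordinates. The following is a minimal sketch under that assumption; the helper name geometries_to_rows and the output keys collection_type, geometry_type and coordinates are made up for illustration and are not part of Polars or pandas.

```python
import json

geojson_text = (
    '{"type":"GeometryCollection","geometries":['
    '{"type":"Linestring","coordinates":[[10,11.2],[10.5,11.9]]},'
    '{"type":"Point","coordinates":[10,20]}]}'
)


def geometries_to_rows(text):
    """Hypothetical helper: return one flat dict per geometry.

    Every value in the result is a string, so all columns have a
    uniform type regardless of how deeply the coordinates are nested.
    """
    collection = json.loads(text)
    return [
        {
            "collection_type": collection["type"],
            "geometry_type": geom["type"],
            # json.dumps keeps the value parseable with json.loads later,
            # unlike str(), which produces a Python repr.
            "coordinates": json.dumps(geom["coordinates"]),
        }
        for geom in collection["geometries"]
    ]


rows = geometries_to_rows(geojson_text)
for row in rows:
    print(row)
```

Because each row is a flat dict of strings, the list should be usable with pl.DataFrame(rows), and the original nesting can be recovered per row with json.loads(row["coordinates"]).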