How to read a gzip-compressed JSON Lines file into a PySpark DataFrame?

I have a JSON Lines file that I wish to read into a PySpark DataFrame. The file is gzip-compressed.

The filename looks like this: file.jl.gz

I know how to read this file into a pandas data frame:

I’m new to PySpark, and I’d like to learn the PySpark equivalent of this. Is there a way to read this file into a PySpark DataFrame?

EDIT 2

I executed the above command and got this error.

Answer
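A minimal sketch of one way to do this, assuming an existing SparkSession and a file whose name ends in .gz. Spark's JSON reader expects one JSON object per line by default and picks the gzip codec from the file extension, so no extra option should be needed. The helper names (write_sample_jsonl_gz, read_jsonl_gz, demo) and the sample records are hypothetical, for illustration only.

```python
import gzip
import json
import os
import tempfile


def write_sample_jsonl_gz(path, records):
    """Write records as gzip-compressed JSON Lines (one JSON object per line)."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


def read_jsonl_gz(spark, path):
    """Read a gzip-compressed JSON Lines file into a Spark DataFrame.

    `spark` is assumed to be an existing SparkSession. The reader treats
    each line as one JSON document and decompresses based on the .gz suffix.
    """
    return spark.read.json(path)


def demo():
    # pyspark is imported here so the helpers above stay importable without it.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-jsonl-gz").getOrCreate()

    # Hypothetical sample data written to a temporary file.jl.gz.
    path = os.path.join(tempfile.mkdtemp(), "file.jl.gz")
    write_sample_jsonl_gz(path, [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])

    df = read_jsonl_gz(spark, path)
    df.printSchema()
    df.show()

    spark.stop()
```

Call demo() on a machine with PySpark installed. gzip is not a splittable format, so Spark reads each .gz file in a single task; repartitioning after the read may help if the file is large.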

User contributions licensed under: CC BY-SA