Tag: text-parsing

PySpark 2.4 – Read CSV file with custom line separator

Support for custom line separators (for various text file formats) was added to Spark in 2017 (see: https://github.com/apache/spark/pull/18581). … or maybe it wasn’t added in 2017, or ever (see: https://github.com/apache/spark/pull/18304). Today, with PySpark 2.4.0, I am unable to use custom line separators to parse CSV files. Here’s some code: Here are two sample CSV files: one.csv – lines are separated
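As a rough illustration of the problem described above, here is a minimal sketch of one possible workaround: split the input on the custom separator first, then parse each record as CSV. The function names (`parse_record`, `parse_csv_text`, `read_csv_with_line_sep`) are hypothetical and not from the original post. The PySpark variant assumes an active `SparkContext` passed in as `sc` and relies on Hadoop's `textinputformat.record.delimiter` setting; it has not been checked against the post's sample files.

```python
import csv


def parse_record(record, field_sep=","):
    """Parse one CSV record (without its line separator) into a list of fields."""
    return next(csv.reader([record], delimiter=field_sep))


def parse_csv_text(text, line_sep, field_sep=","):
    """Split text on a custom line separator, then parse each record as CSV."""
    # Drop empty records, e.g. the one produced by a trailing separator.
    records = [r for r in text.split(line_sep) if r]
    return [parse_record(r, field_sep) for r in records]


def read_csv_with_line_sep(sc, path, line_sep, field_sep=","):
    """PySpark variant (assumption: `sc` is an active SparkContext).

    Hadoop's TextInputFormat is asked to split records on `line_sep`
    instead of newlines; each record is then parsed with the csv module.
    """
    rdd = sc.newAPIHadoopFile(
        path,
        "org.apache.hadoop.mapreduce.lib.input.TextInputFormat",
        "org.apache.hadoop.io.LongWritable",
        "org.apache.hadoop.io.Text",
        conf={"textinputformat.record.delimiter": line_sep},
    )
    # Each element is an (offset, record) pair; keep only the record text.
    return rdd.map(lambda kv: parse_record(kv[1], field_sep))
```

For example, `parse_csv_text("a,b|1,2|3,4|", "|")` treats `|` as the line separator. The sketch assumes records do not themselves contain raw newline characters, since the `csv` module rejects those in unquoted fields.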
