Apache Spark unable to recognize columns in UTF-16 CSV file

Question: Why am I getting the following error on the last line of the code below, and how can the issue be resolved?

AttributeError: 'DataFrame' object has no attribute 'OrderID'

CSV File encoding: UTF-16 LE BOM

Number of columns: 150

Rows: 5000

Language etc.: Python, Apache Spark, Azure-Databricks

MySampleDataFile.txt:

JavaScript

Code sample:

JavaScript

Output of display(df.limit(4)): it successfully displays the content of df in tabular format with column headers, similar to the example here:

JavaScript

Answer

AttributeError: 'DataFrame' object has no attribute 'OrderID'

How can the issue be resolved?

You can try changing the column's data type as follows.

JavaScript

Alternatively,

pyspark.sql.functions.col returns a Column based on the given column name.
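As a minimal sketch of the col-based approach, the snippet below casts a column after first looking up its real header name. It rests on an assumption the question does not confirm: that the header read from a UTF-16 LE BOM file may carry a stray byte-order mark (U+FEFF) or whitespace, so that the name Spark sees is not exactly OrderID. The helper names find_column and cast_column are hypothetical and exist only for illustration.

```python
from typing import Iterable, Optional

BOM = "\ufeff"  # byte-order mark that can survive at the start of the first header


def find_column(columns: Iterable[str], wanted: str) -> Optional[str]:
    """Return the header that equals `wanted` once a BOM and surrounding
    whitespace are stripped, or None if no header matches."""
    for name in columns:
        if name.replace(BOM, "").strip() == wanted:
            return name
    return None


def cast_column(df, wanted: str, new_type: str = "int"):
    """Return a DataFrame in which column `wanted` exists under its clean
    name and is cast to `new_type` (hypothetical helper)."""
    # Imported lazily so find_column can be used without Spark installed.
    from pyspark.sql.functions import col

    actual = find_column(df.columns, wanted)
    if actual is None:
        raise ValueError(f"No column resembling {wanted!r} in {df.columns}")
    # Rename to the clean name first, then reference it by name with col().
    cleaned = df.withColumnRenamed(actual, wanted)
    return cleaned.withColumn(wanted, col(wanted).cast(new_type))
```

On a cluster this would be called as df = cast_column(df, "OrderID", "int"). Printing df.columns (or repr of each name) beforehand is a quick way to check whether the header really differs from what display shows.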

JavaScript