Apache Spark unable to recognize columns in UTF-16 csv file

Question

Question: Why I am getting following error on the last line of the code below, how the issue can be resolved? AttributeError: &#8216;DataFrame&#8217; object has no attribute &#8216;OrderID&#8217; CSV File encoding: UTF-16 LE BOM Number of columns: 150 Rows: 5000 Language etc.: Python, Apache Spark, Azure-Data…

Accepted Answer

AttributeError: &#8216;DataFrame&#8217; object has no attribute &#8216;OrderID&#8217;how the issue can be resolved?You can try the following way to change the data type.df1 = df.withColumn("OrderID", df[“OrderID”].cast(DoubleType()))OR &#8211; Alternative way,pyspark.sql.functions.col It will return a column depending on the name of the provided column.from pyspark.sql.functions import col  df1 = df.withColumn("OrderID", col("OrderID").cast(DoubleType()))

Advertisement

Answer