Converting Pandas dataframe into Spark dataframe error

I’m trying to convert a pandas DataFrame into a Spark DataFrame. The head of the DataFrame:

JavaScript

Code:

JavaScript

And I got an error:

JavaScript

Answer

You need to make sure the columns of your pandas DataFrame are consistent with the types Spark infers for them. If your pandas DataFrame lists something like:

JavaScript

and you’re getting that error, try:

JavaScript

Now, make sure .astype(str) actually gives those columns the type you want. When Spark infers a column’s type from the Python objects it sees, it makes a guess based on some of the values. If that guess doesn’t hold for all the data in the column(s) being converted from pandas to Spark, the conversion fails.
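
As a rough illustration of the idea above, here is a minimal sketch that finds object columns holding more than one Python type and casts them to strings before handing the frame to Spark. The column names, the sample data and the helper functions `mixed_type_columns` and `cast_columns_to_str` are hypothetical and not taken from the question. The final Spark call is left as a comment and assumes an existing SparkSession named `spark`.

```python
import pandas as pd


def mixed_type_columns(pdf):
    """Return names of object-dtype columns whose non-null values
    have more than one Python type (e.g. both str and float)."""
    mixed = []
    for name in pdf.columns:
        if pdf[name].dtype == object:
            kinds = {type(value) for value in pdf[name].dropna()}
            if len(kinds) > 1:
                mixed.append(name)
    return mixed


def cast_columns_to_str(pdf, columns):
    """Return a copy of pdf with the given columns cast to str."""
    out = pdf.copy()
    for name in columns:
        out[name] = out[name].astype(str)
    return out


# Hypothetical data: "code" mixes strings and a float, so it is stored as object.
pdf = pd.DataFrame({"id": [1, 2, 3], "code": ["A1", 7.5, "C3"]})

bad = mixed_type_columns(pdf)          # columns likely to confuse type inference
clean = cast_columns_to_str(pdf, bad)  # every value in those columns is now a str

# Assumption: `spark` is an existing SparkSession.
# sdf = spark.createDataFrame(clean)
```

Casting to str is only one option. If a column is meant to be numeric, converting it to a numeric type instead (and fixing the stray values) keeps the column usable for arithmetic on the Spark side. Depending on the pandas version, missing values may become the literal string "nan" after .astype(str), so check for those as well.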

User contributions licensed under: CC BY-SA