Python pandas mixed-type warning: does passing “dtype” preserve the data?

I have this code that gives this warning:

JavaScript

I have searched both Google and Stack Overflow, and people seem to suggest two kinds of solutions:

  1. low_memory = False
  2. converters

The problem with #1 is that it merely silences the warning but does not solve the underlying problem (correct me if I am wrong).

The problem with #2 is that converters might do things we don’t want. Some say they are inefficient too, but I don’t know.
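
For reference, the two commonly suggested options can be sketched roughly as follows. The inline CSV text and the column names (`id`, `code`) are made up for illustration and are not the data from the question.

```python
import io
import pandas as pd

# Hypothetical sample data; the real file is not shown in the question.
CSV_TEXT = "id,code\n1,007\n2,A12\n"

# Option 1: parse the file in one pass instead of in internal chunks,
# so each column's type is inferred from all of its values at once.
df_whole = pd.read_csv(io.StringIO(CSV_TEXT), low_memory=False)

# Option 2: run a function on every raw cell of a column.
# Here the converter only strips whitespace, so the text is kept as-is.
df_conv = pd.read_csv(io.StringIO(CSV_TEXT), converters={"code": str.strip})

print(df_whole["code"].tolist())
print(df_conv["code"].tolist())
```

In this small sample both calls keep `code` as text, because the column contains a non-numeric value; the difference only matters on large files and on how much control is needed per cell.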

I have come up with a simpler solution:

  • Find the predominant data type of the problematic column.
  • Pass the dtype option while reading the data.

For example, in my case most of the elements in the problematic columns are supposed to be strings, so I passed this:

JavaScript

I don’t get the warning anymore, but will this preserve the data? Since I can’t manually check 6000 values in each of the three columns, will this convert any integer or float to a string without modifying it (3.09 -> “3.09”)? What happens to NaN values?

Answer

You have several options for reading your file.

JavaScript

Case 1: let pandas determine the data types

JavaScript
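
A minimal sketch of what default inference does, assuming a small made-up CSV (the column names `id`, `price`, and `code` are hypothetical and not taken from the question):

```python
import io
import pandas as pd

# Hypothetical sample data: "price" has a blank and an "NA" cell,
# "code" mixes digits-only and alphanumeric values.
CSV_TEXT = "id,price,code\n1,3.09,7\n2,,A12\n3,NA,15\n"

# No dtype or na_values given: pandas infers a type per column.
df_inferred = pd.read_csv(io.StringIO(CSV_TEXT))

print(df_inferred.dtypes)
print(df_inferred)
```

Here `id` is read as integers, `price` as floats with the blank and `NA` cells turned into NaN, and `code` stays text because `A12` cannot be parsed as a number.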

Case 2: add strings to be recognized as NaN values and let pandas determine the data types

JavaScript
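
As a rough illustration of this case, the sketch below assumes the file marks missing values with the placeholder strings `missing` and `?`; these markers and the column names are invented for the example.

```python
import io
import pandas as pd

# Hypothetical sample data with custom "missing value" markers.
CSV_TEXT = "id,price\n1,3.09\n2,missing\n3,?\n"

# Without na_values the markers are ordinary text, so "price" is not numeric.
df_default = pd.read_csv(io.StringIO(CSV_TEXT))

# With na_values the markers become NaN and "price" can be inferred as float.
df_marked = pd.read_csv(io.StringIO(CSV_TEXT), na_values=["missing", "?"])

print(df_default["price"].tolist())
print(df_marked["price"].tolist())
```

The strings passed to `na_values` are added to the default set of NaN markers, so blank cells and values such as `NA` are still treated as missing.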

Case 3: add strings to be recognized as NaN values but keep the data as plain text

JavaScript
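
The sketch below tries to address the original question directly: with `dtype=str`, values are kept as the text found in the file rather than parsed and re-formatted. The sample data, the `missing` marker, and the column names are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical sample data: a float-looking value, a code with leading
# zeros, a custom missing marker, and a blank cell.
CSV_TEXT = "id,price,code\n1,3.09,007\n2,missing,A12\n3,,15\n"

# Keep every column as text, but still treat "missing" and blanks as NaN.
df_text = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str, na_values=["missing"])

# Keep every column as text and disable NaN detection entirely:
# blank cells come back as empty strings.
df_raw = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str, keep_default_na=False)

print(df_text)
print(df_raw)
```

In `df_text`, `3.09` is stored as the string "3.09" and `007` keeps its leading zeros, while missing cells are still NaN rather than the string "nan". If even that is unwanted, `keep_default_na=False` leaves the raw text untouched.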
User contributions licensed under: CC BY-SA