Skip to content
Advertisement

Pandas: ValueError: cannot convert float NaN to integer

I get ValueError: cannot convert float NaN to integer for following:

JavaScript
  • The “x” is a column in the csv file, I cannot spot any float NaN in the file, and I don’t understand the error or why I am getting it.
  • When I read the column as String, then it has values like -1,0,1,…2000, all look very nice int numbers to me.
  • When I read the column as float, then this can be loaded. Then it shows values as -1.0,0.0 etc, still there are no any NaN-s
  • I tried with error_bad_lines = False and dtype parameter in read_csv to no avail. It just cancels loading with same exception.
  • The file is not small (10+ M rows), so cannot inspect it manually, when I extract a small header part, then there is no error, but it happens with full file. So it is something in the file, but cannot detect what.
  • Logically the csv should not have missing values, but even if there is some garbage then I would be ok to skip the rows. Or at least identify them, but I do not see way to scan through file and report conversion errors.

Update: Using the hints in comments/answers I got my data clean with this:

JavaScript

Advertisement

Answer

For identifying NaN values use boolean indexing:

JavaScript

Then for removing all non-numeric values use to_numeric with parameter errors='coerce' – to replace non-numeric values to NaNs:

JavaScript

And for remove all rows with NaNs in column x use dropna:

JavaScript

Last convert values to ints:

JavaScript
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement