How to drop columns from a DataFrame according to their NaN percentage?

In certain columns of df, 80% or more of the values are NaN.

What’s the simplest code to drop such columns?

Answer

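You can use isnull with mean to get the fraction of missing values in each column, then select columns by boolean indexing with loc (loc is needed because the mask applies to columns, not rows). The condition has to be inverted: < .8 keeps the columns below the threshold, which removes every column whose NaN fraction is >= 0.8: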
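
A minimal sketch of this approach, assuming pandas and NumPy are installed. The helper name drop_mostly_nan_columns, its threshold parameter, and the example data are made up for illustration and are not part of the original answer:

```python
import numpy as np
import pandas as pd


def drop_mostly_nan_columns(df: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    """Return df without the columns whose NaN fraction is >= threshold."""
    # isnull() gives a boolean frame; the mean of booleans is the share of True values
    nan_fraction = df.isnull().mean()
    # Inverted condition: keep only the columns strictly below the threshold
    return df.loc[:, nan_fraction < threshold]


# Hypothetical data: "mostly_empty" is 100% NaN, "partial" is 40% NaN
example = pd.DataFrame({
    "full": [1, 2, 3, 4, 5],
    "partial": [1.0, np.nan, 3.0, np.nan, 5.0],
    "mostly_empty": [np.nan] * 5,
})
cleaned = drop_mostly_nan_columns(example)
print(cleaned.columns.tolist())
```

Wrapping the one-liner in a function is only a convenience; df.loc[:, df.isnull().mean() < .8] does the same thing inline.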

Sample:
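
The original sample data is not available here, so the following is a stand-in with invented values. It walks through the intermediate results (the per-column NaN fraction and the boolean mask) for a small frame:

```python
import numpy as np
import pandas as pd

# Invented sample frame with 5 rows
df = pd.DataFrame({
    "A": [np.nan, np.nan, np.nan, np.nan, 5.0],   # 4 of 5 missing
    "B": [1.0, np.nan, 3.0, np.nan, 5.0],         # 2 of 5 missing
    "C": [1.0, 2.0, 3.0, 4.0, 5.0],               # none missing
    "D": [np.nan] * 5,                            # all missing
})

# Fraction of NaN values per column
nan_fraction = df.isnull().mean()
print(nan_fraction)

# True for the columns to keep, False for the columns to drop
keep_mask = nan_fraction < .8
print(keep_mask)

# Boolean indexing on the column axis
result = df.loc[:, keep_mask]
print(result)
```

Column A sits exactly at 80% missing, so it is dropped along with D; only B and C remain.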

If you want to remove columns by a minimum number of non-NaN values instead, dropna works nicely with the thresh parameter, with axis=1 to drop columns:
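
A short sketch of the dropna variant, using invented data. Note that thresh is the minimum count of non-NaN values a column needs in order to be kept, so it is an absolute number rather than a percentage; the value 2 below is an arbitrary choice for the example:

```python
import numpy as np
import pandas as pd

# Invented sample frame with 5 rows
df = pd.DataFrame({
    "A": [np.nan, np.nan, np.nan, np.nan, 5.0],   # 1 non-NaN value
    "B": [1.0, np.nan, 3.0, np.nan, 5.0],         # 3 non-NaN values
    "C": [1.0, 2.0, 3.0, 4.0, 5.0],               # 5 non-NaN values
    "D": [np.nan] * 5,                            # 0 non-NaN values
})

# Keep only the columns that have at least 2 non-NaN values
result = df.dropna(thresh=2, axis=1)
print(result.columns.tolist())
```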

EDIT: For non-Boolean data

For a column to be kept, its total number of NaN entries must be less than 80% of the total number of rows:
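
One way to write this count-based condition, sketched with invented data. It compares the NaN count per column from isnull().sum() against 0.8 times the row count from df.shape[0]:

```python
import numpy as np
import pandas as pd

# Invented sample frame with 5 rows and mixed dtypes
df = pd.DataFrame({
    "name": ["a", None, "c", "d", "e"],            # 1 missing
    "score": [1.0, np.nan, 3.0, np.nan, 5.0],      # 2 missing
    "empty": [np.nan] * 5,                         # 5 missing
})

# Number of NaN entries per column
nan_count = df.isnull().sum()

# Keep the columns whose NaN count is below 80% of the number of rows
result = df.loc[:, nan_count < 0.8 * df.shape[0]]
print(result.columns.tolist())
```

For this frame the count-based mask selects the same columns as df.isnull().mean() < 0.8, since the mean is the count divided by the number of rows.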

User contributions licensed under: CC BY-SA