I am using the following code:
import matplotlib.pyplot as plt
import seaborn as sns

sns.displot(
    data=df.isna().melt(value_name="missing"),
    y="variable",
    hue="missing",
    multiple="fill",
    height=16
)
plt.show()
to create a heatmap of missing values of the df. However, since my df has a lot of columns, the chart has to be very tall to accommodate all the information. I tried altering the data argument to something like this:
data = df[df.columns.values.isna()].isna()
or
data = df[df.isna().sum() > 0].isna()
Basically, I want to filter the dataframe to keep only the columns with at least one missing value. I looked for a correct answer but couldn't find one.
Answer
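Nearly there. Indexing df directly with a boolean Series filters rows, not columns, so the mask has to be applied to df.columns instead. To select all columns with at least one missing value, use: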
df[df.columns[df.isna().any()]]
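As a quick check of that selection, here is a minimal sketch on a made-up frame. The name toy and its columns are hypothetical and only serve the illustration; the .loc form is an equivalent spelling that is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: only "b" and "c" contain missing values.
toy = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [1.0, np.nan, 3.0],
    "c": [None, "x", "y"],
})

# Boolean Series indexed by column name: True where a column has any NaN/None.
mask = toy.isna().any()

# Selection as written in the answer: apply the mask to the column index.
via_columns = toy[toy.columns[mask]]

# Equivalent selection using .loc with the mask on the column axis.
via_loc = toy.loc[:, mask]

print(list(via_columns.columns))
```

Both forms should keep only the columns that contain at least one missing value.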
Alternatively, you could use .sum() and then choose some threshold:
threshold = 0
df[df.columns[df.isna().sum() > threshold]]
Then append .isna().melt(value_name="missing") to the result and pass it as the data argument.
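Putting the pieces together, the data preparation might look like the sketch below. The sample df and the name cols_with_missing are assumptions made for illustration; in practice df would be your own dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real df: "age" and "city" have gaps.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "age": [31, np.nan, 45, np.nan],
    "city": ["Oslo", "Rome", None, "Lima"],
    "score": [0.5, 0.7, 0.9, 0.2],
})

# Keep only columns whose count of missing values exceeds the threshold.
threshold = 0
cols_with_missing = df.columns[df.isna().sum() > threshold]

# Long format: one row per (column, cell), with a boolean "missing" flag.
data = df[cols_with_missing].isna().melt(value_name="missing")

print(data["variable"].unique())
```

The resulting data can then be passed as the data argument of the sns.displot call from the question. With fewer columns on the y axis, a smaller height will probably be enough.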