I have a dataframe with the following structure:
import pandas as pd
df = pd.DataFrame({"a": [1, None, 2], "b": [4, 5, None], "group": ["a", "a", "b"]})
I’d like to know, grouping by group, how many nulls there are in each column.
In this case, the output should be:
group a b
0 a 1 0
1 b 0 1
I don’t have control over how many columns there are or what they are named. Thanks!
Answer
Convert the group column to the index, test all other values for missing values with DataFrame.isna, and aggregate with sum to count the True values:
df = df.set_index('group').isna().groupby('group').sum().reset_index()
print(df)
group a b
0 a 1 0
1 b 0 1
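The same idea can also be written without moving group into the index. This is only a sketch of an equivalent variant, not part of the original answer: it drops the key column before calling isna and passes the original df["group"] Series as the grouping key. The name null_counts is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, None, 2], "b": [4, 5, None], "group": ["a", "a", "b"]})

# Flag missing values in every column except the grouping key,
# then sum the boolean flags within each group (True counts as 1).
null_counts = (
    df.drop(columns="group")
      .isna()
      .groupby(df["group"])
      .sum()
      .reset_index()
)
print(null_counts)
```

Because only the key column is named explicitly, this works for any number of other columns, whatever they are called, and it leaves the original df unchanged.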