
Pandas groupby column and sum nulls of all other columns

I have a dataframe with the following structure:

import pandas as pd

df = pd.DataFrame({"a": [1, None, 2], "b": [4, 5, None], "group": ["a", "a", "b"]})

I’d like to know, grouping by group, how many nulls there are in each column.

In this case, the output should be:

  group  a  b
0     a  1  0
1     b  0  1

I don’t have control over how many columns I have or their names. Thanks!


Answer

Convert the group column to the index, test all the other values for missing values with DataFrame.isna, and aggregate with sum to count the Trues:

df = df.set_index('group').isna().groupby('group').sum().reset_index()

print(df)
  group  a  b
0     a  1  0
1     b  0  1
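
If you would rather not move group into the index, a variation is to build the null mask from the other columns and group it by the original group Series. This is only a sketch under the assumption that the key column is literally named group, as in the question; the names mask and null_counts are made up for illustration. Because the value columns are never named, it should work for any number of columns.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, None, 2], "b": [4, 5, None], "group": ["a", "a", "b"]})

# Boolean mask of missing values for every column except the grouping key.
mask = df.drop(columns="group").isna()

# Group the mask by the original key column; summing booleans counts the Trues.
null_counts = mask.groupby(df["group"]).sum().reset_index()

print(null_counts)
```

For the sample frame this gives the same table as the one-liner above. Note that groupby drops rows whose key is itself missing by default, so nulls in group would not get a row of their own.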
User contributions licensed under: CC BY-SA