Count the total number of occurrences of multiple distinct values in the same data frame

Suppose we have the data frame df:

    c1  c2  c3  c4  c5  c6
0   'A' 'B' NaN NaN NaN NaN
1   'C' 'D' NaN NaN NaN NaN
2   'A' 'A' 'B' NaN NaN NaN
3   'A' 'B' 'C' NaN NaN NaN
4   NaN NaN NaN NaN NaN NaN

I know that to count the number of 'B' values I can use (df == 'B').sum().sum(). Now suppose I want to count how many cells in the data frame hold any of the values in the list v = ['B', 'C']. What would be a way of doing this?

Obviously (df == 'B').sum().sum() + (df == 'C').sum().sum() works, but I need something more general.

(df.isin(v)).sum().sum() works fine.

Answer

Stack the data frame, which reshapes it into a single Series, then use isin on that Series and call sum() once at the end.

>>> df.stack().isin(['B', 'C']).sum()
5

Alternatively, isin works directly on the data frame if you call sum twice: the first sum() gives the count per column, and the second adds those counts together:

>>> df.isin(['B', 'C']).sum().sum()
5
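
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds df from the table in the question (assuming the quoted entries are plain strings and the gaps are np.nan) and runs both approaches; the variable names count_stack, count_isin, and count_numpy are only illustrative. The third variant, which sums over the underlying boolean NumPy array, is an extra not taken from the answer above.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed: strings and NaN only).
df = pd.DataFrame(
    [
        ["A", "B", np.nan, np.nan, np.nan, np.nan],
        ["C", "D", np.nan, np.nan, np.nan, np.nan],
        ["A", "A", "B", np.nan, np.nan, np.nan],
        ["A", "B", "C", np.nan, np.nan, np.nan],
        [np.nan] * 6,
    ],
    columns=["c1", "c2", "c3", "c4", "c5", "c6"],
)
v = ["B", "C"]

# Approach 1: reshape to one long Series, test membership, sum once.
count_stack = int(df.stack().isin(v).sum())

# Approach 2: test membership cell by cell, sum per column, then sum the column totals.
count_isin = int(df.isin(v).sum().sum())

# Variant (not from the answer): sum the boolean mask as a NumPy array in one step.
count_numpy = int(df.isin(v).to_numpy().sum())

print(count_stack, count_isin, count_numpy)
```

NaN cells never match anything in v, so they do not affect the count in any of the three variants.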
