Dataframe count set of conditions passed by several columns on a per row basis

Question

I have a dataframe which looks something like this: I am trying to compute a value based on a condition for every row which will apply across the column groupings of A, B, C, D, etc. and count how many of those groups passed the condition, for example, some pseudo-code: Expected output: This would mean the ex…

Accepted Answer

Let's ignore date and handle the other columns first.

Remove date, create a MultiIndex using str.split, and stack to long form:

    new_df = df.drop(columns='date')
    new_df.columns = new_df.columns.str.split('_', expand=True)
    new_df = new_df.stack(level=0)

         1  2  3
    0 A  4  5  6
      B  2  3  1
      C  5  7  2
      D  4  3  1
    1 A  3  3  2
      B  4  5  2
      C  6  2  3
      D  2  4  2
    2 A  5  7  5
      B  1  3  3
      C  4  5  4
      D  8  2  2
    3 A  6  1  8
      B  6  1  4
      C  1  2  7
      D  4  3  5

Apply the condition row-wise:

    new_df['condition'] = new_df['1'].gt(3) & new_df['2'].gt(new_df['3'])

Then sum over level 0 and assign back to the original df. (The original sum(level=0) call was removed in pandas 2.0; grouping by the level and summing is the equivalent.)

    df['count'] = new_df['condition'].groupby(level=0).sum()

Alternatively, sum the conditions directly rather than assigning to both new_df and df:

    df['count'] = (
        new_df['1'].gt(3) & new_df['2'].gt(new_df['3'])
    ).groupby(level=0).sum()

df:

      date  A_1  A_2  A_3  B_1  B_2  B_3  C_1  C_2  C_3  D_1  D_2  D_3  count
    0  xxx    4    5    6    2    3    1    5    7    2    4    3    1      2
    1  xxx    3    3    2    4    5    2    6    2    3    2    4    2      1
    2  xxx    5    7    5    1    3    3    4    5    4    8    2    2      2
    3  xxx    6    1    8    6    1    4    1    2    7    4    3    5      0

Complete Working Example:

    import pandas as pd

    df = pd.DataFrame({
        'date': ['xxx', 'xxx', 'xxx', 'xxx'], 'A_1': [4, 3, 5, 6],
        'A_2': [5, 3, 7, 1], 'A_3': [6, 2, 5, 8],
        'B_1': [2, 4, 1, 6], 'B_2': [3, 5, 3, 1],
        'B_3': [1, 2, 3, 4], 'C_1': [5, 6, 4, 1],
        'C_2': [7, 2, 5, 2], 'C_3': [2, 3, 4, 7],
        'D_1': [4, 2, 8, 4], 'D_2': [3, 4, 2, 3],
        'D_3': [1, 2, 2, 5]
    })

    # Split 'A_1' into ('A', '1') and move the group letter into the index
    new_df = df.drop(columns='date')
    new_df.columns = new_df.columns.str.split('_', expand=True)
    new_df = new_df.stack(level=0)

    # One boolean per (row, group) pair
    new_df['condition'] = new_df['1'].gt(3) & new_df['2'].gt(new_df['3'])

    # Count the True values per original row (index level 0)
    df['count'] = new_df['condition'].groupby(level=0).sum()
    print(df)
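For comparison, the same count can be computed without reshaping by evaluating the condition once per column group and adding the results. This is only a minimal sketch, not part of the original answer: it assumes every non-date column is named `<group>_<n>` and that the condition is the one used above (`_1` greater than 3 and `_2` greater than `_3`). The names `groups` and `count` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['xxx', 'xxx', 'xxx', 'xxx'], 'A_1': [4, 3, 5, 6],
    'A_2': [5, 3, 7, 1], 'A_3': [6, 2, 5, 8],
    'B_1': [2, 4, 1, 6], 'B_2': [3, 5, 3, 1],
    'B_3': [1, 2, 3, 4], 'C_1': [5, 6, 4, 1],
    'C_2': [7, 2, 5, 2], 'C_3': [2, 3, 4, 7],
    'D_1': [4, 2, 8, 4], 'D_2': [3, 4, 2, 3],
    'D_3': [1, 2, 2, 5]
})

# Assumption: group prefixes are whatever precedes the underscore
# in each non-date column name (A, B, C, D here).
groups = sorted({col.split('_')[0] for col in df.columns if col != 'date'})

# Evaluate the condition per group as a 0/1 Series, then add them up row-wise.
count = sum(
    (df[f'{g}_1'].gt(3) & df[f'{g}_2'].gt(df[f'{g}_3'])).astype(int)
    for g in groups
)

df['count'] = count
print(df[['date', 'count']])
```

With only a handful of groups the loop is easy to read; the stacked version scales more gracefully when there are many groups or the condition changes often.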


Answer