I have a pandas dataframe with 4 columns – A, B, C,D and E. I want create groups by column A and then find the max of column B and C and then populate E with the corresponding value present in D as shown in example below. The condition here is that Column E is to be populated with 0, if a group in column B has all 1’s.
My expected output
column A column B column C column D Column E AA 1 a 12 15 AA 2 d 13 15 AA 3 b 14 15 AA 3 e 15 15 BB 1 c 15 17 BB 1 d 16 17 BB 2 b 17 17 BB 2 a 18 17 CC 1 a 11 0 CC 1 c 10 0 CC 1 b 11 0 CC 1 e 15 0
My code:
df1 = df1.sort_values(by=['column A', 'column B','column C','column D'],ascending=[True,False, False,False]) col_e_df = df1.groupby(['column A']).agg({'column D':['first']}) col_e__df = tranform_aggregated_data(col_e_df, {'first': 'column D'}) #print(col_e_df.sort_index()) df1 = df1.merge(col_e__df, on = ['column A'])
This code works fine when there is number greater than 2, but when when column B has all 1 for any group, it is not populating 0’s(which is what I need to do). When I add the if condition , the group with number greater than 2 also shows 0.
The line of code where it is getting messed up
col_e_df = (np.where(df1['column B'] >= 2, df1.groupby(['column A'])['columnD'].transform('first') , 0)) The actual output from my code is column A column B column C column D Column E AA 1 a 12 0 AA 2 d 13 15 AA 3 b 14 15 AA 3 e 15 15 BB 1 c 15 0 BB 1 d 16 0 BB 2 b 17 17 BB 2 a 18 17 CC 1 a 11 0 CC 1 c 10 0 CC 1 b 11 0 CC 1 e 15 0
Advertisement
Answer
Try:
df = df.sort_values( by=["column A", "column B", "column C", "column D"], ascending=[True, False, False, False], ) df["Column E"] = df.groupby("column A")["column D"].transform("first") # set "Column E" to 0 if all values in "column B" == 1 df["Column E"] = np.where( df.groupby("column A")["column B"].transform(lambda x: x.eq(1).all()), 0, df["Column E"], ) print(df.sort_index())
Prints:
column A column B column C column D Column E 0 AA 1 a 12 15 1 AA 2 d 13 15 2 AA 3 b 14 15 3 AA 3 e 15 15 4 BB 1 c 15 17 5 BB 1 d 16 17 6 BB 2 b 17 17 7 BB 2 a 18 17 8 CC 1 a 11 0 9 CC 1 c 10 0 10 CC 1 b 11 0 11 CC 1 e 15 0