
Group by names and replace values with their max value across all columns in pandas

I have this DataFrame

import pandas as pd

lst = [['AAA',15,'BBB',20],['BBB',16,'AAA',12],['BBB',22,'CCC',15],['CCC',11,'AAA',31],['DDD',25,'EEE',35]]
df = pd.DataFrame(lst,columns = ['name1','val1','name2','val2'])

which looks like this

 name1   val1 name2 val2
0  AAA     15  BBB   20
1  BBB     16  AAA   12
2  BBB     22  CCC   15
3  CCC     11  AAA   31
4  DDD     25  EEE   35

I want this

 name1   val1 name2  val2
0  AAA     31  BBB    22
1  BBB     22  AAA    31
2  BBB     22  CCC    15
3  CCC     15  AAA    31
4  DDD     25  EEE    35

Every value is replaced with the maximum value for its name, where the maximum is taken across both val1 and val2. For example, AAA has the values 15, 12 and 31, so every AAA row gets 31.

If I do this, I only get the maximum from val1:

df["val1"] = df.groupby("name1")["val1"].transform("max")
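
To make the shortfall concrete, here is a minimal check of what the single-column groupby returns on the sample data. The name per_column is just an illustrative variable, not part of the original attempt.

```python
import pandas as pd

lst = [['AAA', 15, 'BBB', 20], ['BBB', 16, 'AAA', 12], ['BBB', 22, 'CCC', 15],
       ['CCC', 11, 'AAA', 31], ['DDD', 25, 'EEE', 35]]
df = pd.DataFrame(lst, columns=['name1', 'val1', 'name2', 'val2'])

# Max of val1 within each name1 group only; val2 is never consulted.
per_column = df.groupby("name1")["val1"].transform("max")

# AAA gets 15 here, although AAA also appears under name2 with 31.
print(per_column.tolist())
```

AAA and CCC come out as 15 and 11 rather than the wanted 31 and 15, because their larger values sit in val2.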

Answer

Try using pd.wide_to_long to reshape the DataFrame into long form, so that all names share one 'name' column and all values share one 'val' column. Then use groupby to find the max value per name, map that max back onto 'name', and reshape back to a four-column (wide) DataFrame:

df_long = pd.wide_to_long(df.reset_index(), ['name','val'], 'index', j='num', sep='', suffix=r'\d+')
mapper= df_long.groupby('name')['val'].max()
df_long['val'] = df_long['name'].map(mapper)
df_new = df_long.unstack()
df_new.columns = [f'{i}{j}' for i,j in df_new.columns]
df_new

Output:

      name1 name2  val1  val2
index                        
0       AAA   BBB    31    22
1       BBB   AAA    22    31
2       BBB   CCC    22    15
3       CCC   AAA    15    31
4       DDD   EEE    25    35
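
If the round trip through long form feels heavy, the same mapper can probably be built without reshaping the original frame. The following is only a sketch of that alternative, not the method from the answer above; the names pairs and result are illustrative.

```python
import pandas as pd

lst = [['AAA', 15, 'BBB', 20], ['BBB', 16, 'AAA', 12], ['BBB', 22, 'CCC', 15],
       ['CCC', 11, 'AAA', 31], ['DDD', 25, 'EEE', 35]]
df = pd.DataFrame(lst, columns=['name1', 'val1', 'name2', 'val2'])

# Stack both (name, val) pairs into one long two-column frame.
pairs = pd.concat([
    df[['name1', 'val1']].rename(columns={'name1': 'name', 'val1': 'val'}),
    df[['name2', 'val2']].rename(columns={'name2': 'name', 'val2': 'val'}),
])

# One max per name, taken across val1 and val2 together.
mapper = pairs.groupby('name')['val'].max()

# Look each name up in the mapper; column order and index stay as in df.
result = df.copy()
result['val1'] = result['name1'].map(mapper)
result['val2'] = result['name2'].map(mapper)
print(result)
```

This keeps the original column order (name1, val1, name2, val2), but the column pairs are hard-coded, so with many numbered pairs the wide_to_long version scales better.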
User contributions licensed under: CC BY-SA