I have the following dataframe
import pandas as pd test = pd.DataFrame({'y':[1,2,3,4,5,6], 'label': ['bottom', 'top','bottom', 'top','bottom', 'top']}) y label 0 1 bottom 1 2 top 2 3 bottom 3 4 top 4 5 bottom 5 6 top
I would like to add a new column, agg_y
, which would be the the max(y)
if label=="bottom"
and min(y)
if label=="top"
. I have tried this
test['min_y'] = test.groupby('label').y.transform('min') test['max_y'] = test.groupby('label').y.transform('max') test['agg_y'] = np.where(test.label == "bottom", test.max_y, test.min_y) test.drop(columns=['min_y', 'max_y'], inplace=True)
which gives the correct result
y label agg_y 0 1 bottom 5 1 2 top 2 2 3 bottom 5 3 4 top 2 4 5 bottom 5 5 6 top 2
I am just looking fora one-liner solution, if possible
Advertisement
Answer
Your solution in one line solution is:
test['agg_y'] = np.where(test.label == "bottom", test.groupby('label').y.transform('max'), test.groupby('label').y.transform('min'))
Solution without groupby
, thank you @ouroboros1:
test['agg_y'] = np.where(test.label == 'bottom', test.loc[test.label.eq('bottom'), 'y'].max(), test.loc[test.label.ne('bottom'), 'y'].min())
Another idea is mapping values, idea is similar like ouroboros1 solution:
d = {'bottom':'max', 'top':'min'} test['agg_y'] = test['label'].map({val:test.loc[test.label.eq(val),'y'].agg(func) for val, func in d.items()}) print (test) y label agg_y 0 1 bottom 5 1 2 top 2 2 3 bottom 5 3 4 top 2 4 5 bottom 5 5 6 top 2