I have a dataframe like below
df = pd.DataFrame({'testid':(1,2,1,2,1,2),'Name':('apple','apple','melon','melon','orange','orange'), 'A': (1,2,10,20,5,5), 'B': (1,5,4,2,3,1)})
| testid | Name | A | B | 
|---|---|---|---|
| 1 | apple | 1 | 1 | 
| 2 | apple | 2 | 5 | 
| 1 | melon | 10 | 4 | 
| 2 | melon | 20 | 2 | 
| 1 | orange | 5 | 3 | 
| 2 | orange | 5 | 1 | 
I want to return a slice of this dataframe ( still a dataframe ) for every testid and Column A and B that if the corresponding apple value is larger than 1 then it returns the corresonding melon value, else return 0. basically I want to get a DataFrame like this
| testid | A | B | 
|---|---|---|
| 1 | 0 | 0 | 
| 2 | 20 | 2 | 
how to achieve this? I tried apply() with lambda x:, but didn’t find a way to put in the dataframe column into the lambda function.
Advertisement
Answer
I believe this should work for you. Replace values corresponding to melon by according to values corresponding to apple.
p = df.groupby(['testid','Name']).sum()
p.xs('melon', level=1).where(p.xs('apple', level=1)>1, 0).reset_index()
