I have a dataframe like below
df = pd.DataFrame({'testid':(1,2,1,2,1,2),'Name':('apple','apple','melon','melon','orange','orange'), 'A': (1,2,10,20,5,5), 'B': (1,5,4,2,3,1)})
testid | Name | A | B |
---|---|---|---|
1 | apple | 1 | 1 |
2 | apple | 2 | 5 |
1 | melon | 10 | 4 |
2 | melon | 20 | 2 |
1 | orange | 5 | 3 |
2 | orange | 5 | 1 |
I want to return a slice of this dataframe ( still a dataframe ) for every testid and Column A and B that if the corresponding apple value is larger than 1 then it returns the corresonding melon value, else return 0. basically I want to get a DataFrame like this
testid | A | B |
---|---|---|
1 | 0 | 0 |
2 | 20 | 2 |
how to achieve this? I tried apply() with lambda x:, but didn’t find a way to put in the dataframe column into the lambda function.
Advertisement
Answer
I believe this should work for you. Replace values corresponding to melon
by according to values corresponding to apple
.
p = df.groupby(['testid','Name']).sum() p.xs('melon', level=1).where(p.xs('apple', level=1)>1, 0).reset_index()