I'm working in Python with pandas and can't figure out how to transform this DataFrame:
import pandas as pd
df = pd.DataFrame({'Interview':['1','1','2','2','3','3','4','4','5','5'],
'Sequence':['1st','2nd','1st','2nd','1st','2nd','1st','2nd','1st','2nd'],
'Product':['Orange','Banana','Banana','Orange','Apple','Banana','Apple','Apple','Orange','Apple'],
'Value':[6,8,5,4,3,5,7,6,7,9]})
I need a result like this:
sequence = pd.DataFrame({'1st':['Orange','Orange','Orange','Banana','Banana','Banana','Apple','Apple','Apple'],
'2nd':['Orange','Banana','Apple','Orange','Banana','Apple','Orange','Banana','Apple'],
'Value':[0,14,16,9,0,0,0,8,13]})
The order of the products (1st, then 2nd) is what matters most in my analysis. Each Value is the sum of the values from the interviews that share that 1st/2nd pair.
Thanks guys for the help!
Answer
Here is another approach using set_index and unstack:
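# Pivot so each interview is one row, with 1st/2nd as column sub-levels
df2 = df.set_index(['Interview', 'Sequence']).unstack()
# Keep the product pair and add each interview's total value
result = df2['Product'].join(df2['Value'].sum(axis=1).rename('Value'))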
#              1st     2nd  Value
# Interview
# 1         Orange  Banana     14
# 2         Banana  Orange      9
# 3          Apple  Banana      8
# 4          Apple   Apple     13
# 5         Orange   Apple     16
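The table above has one row per interview, while the question asks for one row per 1st/2nd combination, with 0 for pairs that never occur. A possible way to get there is sketched below. It builds the per-interview table with pivot and groupby instead of unstack (a substitution made here so the block runs on its own), and the names `per_interview`, `order` and `grid` are illustrative, not from the original answer. The product order is assumed from the expected output in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Interview': ['1', '1', '2', '2', '3', '3', '4', '4', '5', '5'],
    'Sequence': ['1st', '2nd', '1st', '2nd', '1st', '2nd', '1st', '2nd', '1st', '2nd'],
    'Product': ['Orange', 'Banana', 'Banana', 'Orange', 'Apple',
                'Banana', 'Apple', 'Apple', 'Orange', 'Apple'],
    'Value': [6, 8, 5, 4, 3, 5, 7, 6, 7, 9],
})

# One row per interview: the 1st and 2nd product plus the interview's total value
products = df.pivot(index='Interview', columns='Sequence', values='Product')
totals = df.groupby('Interview')['Value'].sum()
per_interview = products.join(totals)

# Every ordered pair of products, in the order used in the question (assumed)
order = ['Orange', 'Banana', 'Apple']
grid = pd.MultiIndex.from_product([order, order], names=['1st', '2nd'])

# Sum the totals per (1st, 2nd) pair and fill unseen pairs with 0
sequence = (
    per_interview.groupby(['1st', '2nd'])['Value'].sum()
    .reindex(grid, fill_value=0)
    .reset_index()
)
print(sequence)
```

If two interviews had the same 1st/2nd pair, the groupby step would add their totals together, which matches the "sum of the results in interviews" described in the question.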

