I would like to achieve the result below in Python using Pandas.
I tried groupby
and sum
on the id and Group columns using the below:
df.groupby(['id','Group'])['Total'].sum()
I got the first two columns, but I’m not sure how to get the third column (Overall_Total).
How can I do it?
Initial data (before grouping)
id | Group | Time |
---|---|---|
1 | a | 2 |
1 | a | 2 |
1 | a | 1 |
1 | b | 1 |
1 | b | 1 |
1 | c | 1 |
2 | e | 2 |
2 | a | 4 |
2 | e | 1 |
2 | a | 5 |
3 | c | 1 |
3 | e | 4 |
3 | a | 3 |
3 | e | 4 |
3 | a | 2 |
3 | h | 4 |
Advertisement
Answer
Assuming df
is your initial dataframe, please try this:
df_group = df.groupby(['id','group']).sum(['time']).rename(columns={'time':'Total'}) df_group['All_total'] = df_group.groupby(['id'])['Total'].transform('sum')