Can pandas perform an aggregating operation involving two columns?

Question

Given the following dataframe, is it possible to calculate the sum of col2 and the sum of col2 + col3 in a single aggregating function?

      col1  col2  col3
    0    a     1    10
    1    a     2    20
    2    b     3    30
    3    b     4    40

In R's dplyr I would do it with a single line of summarize, and I

Accepted Answer

Let us try assigning the new column first:

    out = df.assign(col23=df.col2 + df.col3).groupby('col1', as_index=False).sum()

    Out[81]:
      col1  col2  col3  col23
    0    a     3    30     33
    1    b     7    70     77

From my understanding, apply is more like summarize in R:

    # one output row per group, built from a Series of named results
    out = (
        df.groupby('col1')
          .apply(lambda x: pd.Series({'col2_sum': x['col2'].sum(),
                                      'col23_sum': (x['col2'] + x['col3']).sum()}))
          .reset_index()
    )

    Out[83]:
      col1  col2_sum  col23_sum
    0    a         3         33
    1    b         7         77
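For completeness, here is a self-contained sketch that combines the two ideas above: assign the combined column first, then give each output its own name. It swaps in pandas named aggregation (`agg` with `name=(column, func)` pairs), which is not part of the answer as written and assumes pandas 0.25 or later. The output names `col2_sum` and `col23_sum` are borrowed from the apply version.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b"],
    "col2": [1, 2, 3, 4],
    "col3": [10, 20, 30, 40],
})

# Build the combined column, then aggregate with explicit output names.
# Named aggregation is an assumption here (pandas >= 0.25), not the answer's method.
out = (
    df.assign(col23=df["col2"] + df["col3"])
      .groupby("col1")
      .agg(col2_sum=("col2", "sum"), col23_sum=("col23", "sum"))
      .reset_index()
)
print(out)
```

This should give the same two rows as the apply version (3 and 33 for a, 7 and 77 for b), and it avoids calling a Python lambda once per group, which may matter on larger frames.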

Answer