groupby in pandas with custom function over a subset of rows in each group

Question

I have a pandas DataFrame in which (version, branch) is a MultiIndex.

Problem description: I want to group by version and set the value in column X for the branch overall to the sum of the values in column X for the remaining branches (those with the same version), weighted by th…

Accepted Answer

Use:

    # select the overall rows only
    overall = df['N'].xs('overall', level=1)

    # select all rows except the overall rows
    df1 = df.drop('overall', level=1)

    # multiply N by X, sum per version, then divide by the overall N
    s = df1['N'].mul(df1['X']).groupby(level=0).sum().div(overall)

    # rebuild the (version, 'overall') MultiIndex and assign back
    df.loc[pd.IndexSlice[:, 'overall'], 'X'] = pd.concat({'overall': s}).swaplevel(0, 1)

    print(df)

                          N         X
    version branch
    v1      overall  2475.0  1.353535
            A        1712.5  1.000000
            B         257.5  2.000000
            C         392.5  2.000000
            D         112.5  3.000000
    v2      overall  2475.0  1.053939
            A        2341.5  1.000000
            B          95.0  2.000000
            C          38.5  2.000000
    v3      overall  2475.0  1.191919
            A        2000.0  1.000000
            B         475.0  2.000000
    v4      overall  2475.0  1.000000
            A        2341.5  1.000000
            B         133.5  1.000000
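For anyone who wants to run the approach end to end, here is a minimal self-contained sketch. It is not the answer's exact code: it uses a boolean mask instead of `xs`/`IndexSlice`, which should give the same result on this data. The input frame is a reconstruction from the printed output above. The original X values of the overall rows are not shown on this page, so they are set to NaN here as an assumption, and the names `is_overall`, `parts`, `weighted_sum`, `overall_n`, and `new_x` are illustrative.

```python
import numpy as np
import pandas as pd

# Reconstructed input (assumption): N and the non-overall X values come from
# the printed output; X for the 'overall' rows is unknown, so NaN is used.
rows = [
    ("v1", "overall", 2475.0, np.nan),
    ("v1", "A", 1712.5, 1.0),
    ("v1", "B", 257.5, 2.0),
    ("v1", "C", 392.5, 2.0),
    ("v1", "D", 112.5, 3.0),
    ("v2", "overall", 2475.0, np.nan),
    ("v2", "A", 2341.5, 1.0),
    ("v2", "B", 95.0, 2.0),
    ("v2", "C", 38.5, 2.0),
    ("v3", "overall", 2475.0, np.nan),
    ("v3", "A", 2000.0, 1.0),
    ("v3", "B", 475.0, 2.0),
    ("v4", "overall", 2475.0, np.nan),
    ("v4", "A", 2341.5, 1.0),
    ("v4", "B", 133.5, 1.0),
]
df = pd.DataFrame(rows, columns=["version", "branch", "N", "X"])
df = df.set_index(["version", "branch"])

# Boolean mask marking the 'overall' row of each version.
is_overall = df.index.get_level_values("branch") == "overall"

# Per version: sum of N * X over the non-overall branches.
parts = df[~is_overall]
weighted_sum = (parts["N"] * parts["X"]).groupby(level="version").sum()

# N of the overall row, indexed by version only.
overall_n = df.loc[is_overall, "N"].droplevel("branch")

# Weighted value per version, ordered to match the overall rows in df.
new_x = weighted_sum / overall_n
versions = df.index[is_overall].get_level_values("version")
df.loc[is_overall, "X"] = new_x.reindex(versions).to_numpy()

print(df)
```

The mask-based assignment writes values positionally, so `new_x` is reindexed to the order of the overall rows first. The answer's `pd.concat({'overall': s}).swaplevel(0, 1)` achieves the same thing by index alignment instead.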

