groupby weighted average and sum in pandas dataframe

Question

I have a dataframe. I need the sum of adjusted_lots and the weighted average of price (weighted by adjusted_lots), grouped by all the other columns, i.e. grouped by (contract, month, year and buys). A similar solution in R was achieved using dplyr, but I am unable to do the same in pandas.

Accepted Answer

EDIT: updated the aggregation so it works with recent versions of pandas.

To pass multiple functions to a groupby object, you need to pass tuples of the column and the aggregation function to apply to it:

    import numpy as np

    # Define a lambda function to compute the weighted mean:
    wm = lambda x: np.average(x, weights=df.loc[x.index, "adjusted_lots"])

    # Define a dictionary with the functions to apply for a given column:
    # the following is deprecated since pandas 0.20:
    # f = {'adjusted_lots': ['sum'], 'price': {'weighted_mean' : wm} }
    # df.groupby(["contract", "month", "year", "buys"]).agg(f)

    # Groupby and aggregate with namedAgg [1]:
    df.groupby(["contract", "month", "year", "buys"]).agg(adjusted_lots=("adjusted_lots", "sum"),
                                                            price_weighted_mean=("price", wm))

                              adjusted_lots  price_weighted_mean
    contract month year buys
    C        Z     5    Sell            -19           424.828947
    CC       U     5    Buy               5          3328.000000
    SB       V     5    Buy              12            11.637500
    W        Z     5    Sell             -5           554.850000

You can see more here:
http://pandas.pydata.org/pandas-docs/stable/groupby.html#applying-multiple-functions-at-once
and in a similar question here: Apply multiple functions to multiple groupby columns

[1]: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v0.25.0.html#groupby-aggregation-with-relabeling
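The named-aggregation approach in this answer can be tried end to end with a small frame. The sketch below is only illustrative: the data is hypothetical (the asker's original frame is not reproduced here), and `weighted_mean`, `keys`, `totals` and `check` are names made up for this example. It also adds a cross-check that is not part of the original answer, computing sum(price * lots) / sum(lots) per group without a per-group Python call.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, NOT the asker's original dataframe.
df = pd.DataFrame(
    {
        "contract": ["A", "A", "B", "B"],
        "month": ["Z", "Z", "U", "U"],
        "year": [5, 5, 5, 5],
        "buys": ["Sell", "Sell", "Buy", "Buy"],
        "price": [100.0, 110.0, 50.0, 60.0],
        "adjusted_lots": [-2, -3, 1, 3],
    }
)

keys = ["contract", "month", "year", "buys"]


def weighted_mean(prices):
    # Look up this group's weights in df through the shared row index.
    return np.average(prices, weights=df.loc[prices.index, "adjusted_lots"])


# Named aggregation: output_column=(input_column, function).
result = df.groupby(keys).agg(
    adjusted_lots=("adjusted_lots", "sum"),
    price_weighted_mean=("price", weighted_mean),
)

# Cross-check (an alternative, not the answer's method):
# weighted mean = sum(price * lots) / sum(lots) within each group.
totals = (
    df.assign(notional=df["price"] * df["adjusted_lots"])
    .groupby(keys)[["notional", "adjusted_lots"]]
    .sum()
)
check = totals["notional"] / totals["adjusted_lots"]

print(result)
print(check)
```

For group A the weighted mean is (100 * -2 + 110 * -3) / -5 = 106.0, and for group B it is (50 * 1 + 60 * 3) / 4 = 57.5. Note that `weighted_mean` reads the weights from the outer `df`, so it only works when the grouped frame keeps the same index as `df`; also, `np.average` raises an error for a group whose weights sum to zero.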


Answer