Skip to content
Advertisement

How to calculate 12 month rolling sum based on groupby?

I am trying to calculate the 12 month rolling sum for the number of orders and revenue based on a person’s name using Python for the following dataframe:

df = pd.DataFrame( columns = ['Name','Month','Orders','Revenue']) 
df['Name'] = ['Ava','Ava','Ava','Baptiste','Baptiste','Baptiste','Baptiste','Cam','Cam']
df['Month'] =['2019-07','2020-04','2021-03','2020-03','2020-10','2021-03','2021-05','2020-01','2021-03'] 
df['Orders'] =[2,12,4,5,9,10,3,4,5]
df['Revenue'] = [10,25,15,5,8,10,12,20,5]

In order to give the following output:

df['Rolling 12 Month Orders'] = [2,14,16,5,14,24,22,4,5]
df['Rolling 12 Month Revenue'] = [10,35,40,5,13,23,30,20,5]

The rolling sum should add up all the totals in the past 12 months grouped by the name column.

I have tried the following:

df['Rolling 12 Month Orders'] = df.groupby(['Name','Month'])['Orders'].rolling(12).sum()

but it does not give me the expected result.

Advertisement

Answer

You’re passing the rolling frequency as 12, pandas does not know that you want to specify a 12 months window, also you need to make sure that your Month column is identified as a date type column, try this:

df['Month'] = pd.to_datetime(df['Month'])

df['Rolling 12 Month Orders'] = df.groupby(['Name']).rolling('365d', on = 'Month')['Orders'].sum().reset_index( drop = True)

df['Rolling 12 Month Revenue'] = df.groupby(['Name']).rolling('365d', on = 'Month')['Revenue'].sum().reset_index( drop = True)

Also, you need to use 365d instead of 12m, since using 12m would throw an error : ValueError: <12 * MonthEnds> is a non-fixed frequency.

A brief explanation for this is that a rolling window must have a fixed width – “a month” does not have a fixed number of days.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement