I am trying to calculate the 14 day rolling average for retail data with multiple different hierarchies. The ‘Store’ dataframe looks like this:
Store | Inventory-Small | Inventory-Medium | Date | Purchases-Small | Purchases-Medium ----------------------------------------------------------------------------------------------------- A 12 14 4/1/20 2 4 B 13 16 4/1/20 4 5 A 15 10 4/2/20 2 6 C 20 15 4/1/20 4 5 A 16 8 4/3/20 2 4 A 16 10 4/4/20 4 5 A 15 12 4/5/20 1 3 C 18 14 4/2/20 2 3 C 19 12 4/3/20 6 9 B 14 14 4/2/20 3 8
What I am trying to do is create a rolling 14 day average for the purchases column for each store. The data extends well past 14 day (over 8 months), and I would like the first 14 days of each store to be a simple average. My issue is that while I can group by ‘Store’ and create a column, I don’t know how to also group by dates. I’ve tried:
Store.sort_values(['Store','Date'],ascending=(False,False)) Store['Rolling_Purchase_S'] = Store.groupby(['Store','Date'], as_index=False)['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
and also:
Store['Rolling_Purchase_S'] = Store.groupby('Store')['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
The first one doesn’t seem to have any effect while the second one doesn’t group by dates so I end up with a rolling average in the wrong order. Any advice would be much appreciated!
Edit: The following lines worked, thanks to all for the feedback.
Store.sort_values(['Store','Date'],ascending=(False,True),inplace=True) Store['Rolling_Purchase_S'] = Store.groupby('Store')['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
Advertisement
Answer
I believe it’s working fine as long as you sort inplace and remove ‘Date’ from groupby
:
Store.sort_values(['Store','Date'], ascending=(False,False), inplace=True) Store['Rolling_Purchase_S'] = Store.groupby(['Store'])['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
Output:
print(Store[['Store', 'Date', 'Purchases-Small', 'Rolling_Purchase_S']]) Store Date Purchases-Small Rolling_Purchase_S 8 C 2020-04-03 6 6.000000 7 C 2020-04-02 2 4.000000 3 C 2020-04-01 4 4.000000 9 B 2020-04-02 3 3.000000 1 B 2020-04-01 4 3.500000 6 A 2020-04-05 1 1.000000 5 A 2020-04-04 4 2.500000 4 A 2020-04-03 2 2.333333 2 A 2020-04-02 2 2.250000 0 A 2020-04-01 2 2.200000