I am trying to calculate the 14 day rolling average for retail data with multiple different hierarchies. The ‘Store’ dataframe looks like this:
Store | Inventory-Small | Inventory-Medium | Date | Purchases-Small | Purchases-Medium
-----------------------------------------------------------------------------------------------------
A 12 14 4/1/20 2 4
B 13 16 4/1/20 4 5
A 15 10 4/2/20 2 6
C 20 15 4/1/20 4 5
A 16 8 4/3/20 2 4
A 16 10 4/4/20 4 5
A 15 12 4/5/20 1 3
C 18 14 4/2/20 2 3
C 19 12 4/3/20 6 9
B 14 14 4/2/20 3 8
What I am trying to do is create a rolling 14 day average for the purchases column for each store. The data extends well past 14 day (over 8 months), and I would like the first 14 days of each store to be a simple average. My issue is that while I can group by ‘Store’ and create a column, I don’t know how to also group by dates. I’ve tried:
Store.sort_values(['Store','Date'],ascending=(False,False))
Store['Rolling_Purchase_S'] = Store.groupby(['Store','Date'], as_index=False)['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
and also:
Store['Rolling_Purchase_S'] = Store.groupby('Store')['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
The first one doesn’t seem to have any effect while the second one doesn’t group by dates so I end up with a rolling average in the wrong order. Any advice would be much appreciated!
Edit: The following lines worked, thanks to all for the feedback.
Store.sort_values(['Store','Date'],ascending=(False,True),inplace=True)
Store['Rolling_Purchase_S'] = Store.groupby('Store')['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
Advertisement
Answer
I believe it’s working fine as long as you sort inplace and remove ‘Date’ from groupby
:
Store.sort_values(['Store','Date'], ascending=(False,False), inplace=True)
Store['Rolling_Purchase_S'] = Store.groupby(['Store'])['Purchases-Small'].transform(lambda x: x.rolling(14, 1).mean())
Output:
print(Store[['Store', 'Date', 'Purchases-Small', 'Rolling_Purchase_S']])
Store Date Purchases-Small Rolling_Purchase_S
8 C 2020-04-03 6 6.000000
7 C 2020-04-02 2 4.000000
3 C 2020-04-01 4 4.000000
9 B 2020-04-02 3 3.000000
1 B 2020-04-01 4 3.500000
6 A 2020-04-05 1 1.000000
5 A 2020-04-04 4 2.500000
4 A 2020-04-03 2 2.333333
2 A 2020-04-02 2 2.250000
0 A 2020-04-01 2 2.200000