I want to create rolling slices of a master dataframe. I’m trying to measure the difference in outcomes over rolling periods. The master dataframe has 120 years of data and I want to create rolling slices of 10 years of a column(s), i.e slice one goes from year 1 to 10, slice 2 goes from year 2 to 11, etc…
I’m trying to create one slice dataframe that contains all the slices, 12 rolling 10 year periods in my example.
I’m trying to do it in a for loop like this:
length_of_slice = 10
slice_df = pd.DataFrame(index=range(length_of_slice))
for i in range(0, len(slice_df)):
slice_df['Data'+ str(i)] = master_df.loc[i:(i+9)]['Data'].to_list()
So, the first run through this should give me row 0 to 9 of master _df in the 10 rows of slice_df. The second time through the loop should give me row 1 to 10 of master_df in the 10 rows of slice_df.
I keep getting a ValueError: Length of values does not match length of index. Can I use the index of a for loop in .loc like this? Is there a better approach to this?
If I do the steps in sequential order outside a for loop it works.
slice_df['Data1'] = master_df.loc[0:9]['Data'].to_list()
slice_df['Data2'] = master_df.loc[1:10]['Data'].to_list()
So, it’s something I’m doing in the for loop.
Advertisement
Answer
One option is to iterate over the output of rolling
.
import numpy as np
import pandas as pd
n = 10
master_df = pd.DataFrame({"Data": np.random.randint(0, 99, size=20)})
slice_df = pd.DataFrame([list(i) for i in master_df["Data"].rolling(n)]).dropna().T
slice_df.columns = [f"Data{i}" for i, _ in enumerate(slice_df, start=1)]
will give you e.g.
Data1 Data2 Data3 Data4 Data5 Data6 Data7 Data8 Data9 Data10 Data11
0 31.0 84.0 85.0 7.0 71.0 58.0 88.0 83.0 17.0 98.0 57.0
1 84.0 85.0 7.0 71.0 58.0 88.0 83.0 17.0 98.0 57.0 23.0
2 85.0 7.0 71.0 58.0 88.0 83.0 17.0 98.0 57.0 23.0 74.0
3 7.0 71.0 58.0 88.0 83.0 17.0 98.0 57.0 23.0 74.0 75.0
4 71.0 58.0 88.0 83.0 17.0 98.0 57.0 23.0 74.0 75.0 76.0
5 58.0 88.0 83.0 17.0 98.0 57.0 23.0 74.0 75.0 76.0 30.0
6 88.0 83.0 17.0 98.0 57.0 23.0 74.0 75.0 76.0 30.0 93.0
7 83.0 17.0 98.0 57.0 23.0 74.0 75.0 76.0 30.0 93.0 96.0
8 17.0 98.0 57.0 23.0 74.0 75.0 76.0 30.0 93.0 96.0 71.0
9 98.0 57.0 23.0 74.0 75.0 76.0 30.0 93.0 96.0 71.0 58.0