I am computing a DataFrame with weekly amounts and now I need to fill it with missing weeks from a provided date range.
This is how I’m generating the dataframe with the weekly amounts:
JavaScript
x
7
1
df['date'] = pd.to_datetime(df['date']) - timedelta(days=6)
2
weekly_data: pd.DataFrame = (df
3
.groupby([pd.Grouper(key='date', freq='W-SUN')])[data_type]
4
.sum()
5
.reset_index()
6
)
7
Which outputs:
JavaScript
1
4
1
date sum
2
0 2020-10-11 78
3
1 2020-10-18 673
4
If a date range is given as start='2020-08-30'
and end='2020-10-30'
, then I would expect the following dataframe:
JavaScript
1
11
11
1
date sum
2
0 2020-08-30 0.0
3
1 2020-09-06 0.0
4
2 2020-09-13 0.0
5
3 2020-09-20 0.0
6
4 2020-09-27 0.0
7
5 2020-10-04 0.0
8
6 2020-10-11 78
9
7 2020-10-18 673
10
8 2020-10-25 0.0
11
So far, I have managed to just add the missing weeks and set the sum to 0, but it also replaces the existing values:
JavaScript
1
2
1
weekly_data = weekly_data.reindex(pd.date_range('2020-08-30', '2020-10-30', freq='W-SUN')).fillna(0)
2
Which outputs:
JavaScript
1
11
11
1
date sum
2
0 2020-08-30 0.0
3
1 2020-09-06 0.0
4
2 2020-09-13 0.0
5
3 2020-09-20 0.0
6
4 2020-09-27 0.0
7
5 2020-10-04 0.0
8
6 2020-10-11 0.0 # should be 78
9
7 2020-10-18 0.0 # should be 673
10
8 2020-10-25 0.0
11
Advertisement
Answer
Remove reset_index
for DatetimeIndex
, because reindex
working with index and if RangeIndex
get 0
values, because no match:
JavaScript
1
4
1
weekly_data = (df.groupby([pd.Grouper(key='date', freq='W-SUN')])[data_type]
2
.sum()
3
)
4
Then is possible use fill_value=0
parameter and last add reset_index
:
JavaScript
1
14
14
1
r = pd.date_range('2020-08-30', '2020-10-30', freq='W-SUN', name='date')
2
weekly_data = weekly_data.reindex(r, fill_value=0).reset_index()
3
print (weekly_data)
4
date sum
5
0 2020-08-30 0
6
1 2020-09-06 0
7
2 2020-09-13 0
8
3 2020-09-20 0
9
4 2020-09-27 0
10
5 2020-10-04 0
11
6 2020-10-11 78
12
7 2020-10-18 673
13
8 2020-10-25 0
14