I am computing a DataFrame with weekly amounts and now I need to fill it with missing weeks from a provided date range.
This is how I’m generating the dataframe with the weekly amounts:
df['date'] = pd.to_datetime(df['date']) - timedelta(days=6) weekly_data: pd.DataFrame = (df .groupby([pd.Grouper(key='date', freq='W-SUN')])[data_type] .sum() .reset_index() )
Which outputs:
date sum 0 2020-10-11 78 1 2020-10-18 673
If a date range is given as start='2020-08-30'
and end='2020-10-30'
, then I would expect the following dataframe:
date sum 0 2020-08-30 0.0 1 2020-09-06 0.0 2 2020-09-13 0.0 3 2020-09-20 0.0 4 2020-09-27 0.0 5 2020-10-04 0.0 6 2020-10-11 78 7 2020-10-18 673 8 2020-10-25 0.0
So far, I have managed to just add the missing weeks and set the sum to 0, but it also replaces the existing values:
weekly_data = weekly_data.reindex(pd.date_range('2020-08-30', '2020-10-30', freq='W-SUN')).fillna(0)
Which outputs:
date sum 0 2020-08-30 0.0 1 2020-09-06 0.0 2 2020-09-13 0.0 3 2020-09-20 0.0 4 2020-09-27 0.0 5 2020-10-04 0.0 6 2020-10-11 0.0 # should be 78 7 2020-10-18 0.0 # should be 673 8 2020-10-25 0.0
Advertisement
Answer
Remove reset_index
for DatetimeIndex
, because reindex
working with index and if RangeIndex
get 0
values, because no match:
weekly_data = (df.groupby([pd.Grouper(key='date', freq='W-SUN')])[data_type] .sum() )
Then is possible use fill_value=0
parameter and last add reset_index
:
r = pd.date_range('2020-08-30', '2020-10-30', freq='W-SUN', name='date') weekly_data = weekly_data.reindex(r, fill_value=0).reset_index() print (weekly_data) date sum 0 2020-08-30 0 1 2020-09-06 0 2 2020-09-13 0 3 2020-09-20 0 4 2020-09-27 0 5 2020-10-04 0 6 2020-10-11 78 7 2020-10-18 673 8 2020-10-25 0