Pandas fill missing dates and values simultaneously for each group

I have a dataframe (mydf) with dates for each group in monthly frequency like below:

JavaScript

I want to fill in the missing dates (Dt) for each group, starting from that group's own first date and running up to the maximum date in the whole date column, while filling the Sales column with 0 for the added rows. So each group starts at its own start date, but all groups end at the same end date.

For example, Id=A will start at 2020-10-01 and run all the way to 2021-06-03, and Sales for the filled-in dates will be 0.

So the output will be:

JavaScript

I have tried reindex, but instead of specifying the date range manually I want to use the dates in each group.

My code is:

JavaScript

Answer

Let’s try:

  1. Get the minimum date per group using groupby.min.
  2. Add a new column called max to the aggregated minimums, which stores the maximum date of the whole frame (Series.max on Dt).
  3. Create an individual date_range per group based on the min and max values.
  4. Use Series.explode to turn the ranges into rows, giving a DataFrame that represents the new index.
  5. Build a MultiIndex with MultiIndex.from_frame to reindex the DataFrame with.
  6. reindex with midx and set fill_value=0.
JavaScript
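
As a rough illustration of the steps above, here is a minimal sketch. The column names Dt, Id and Sales follow the question, but the sample values are made up, and the dates are assumed to fall on month starts (hence freq="MS"). It also swaps the apply/Series.explode step for a pd.concat of one small frame per group, which yields the same kind of "new index" frame; treat it as one possible way to do this rather than the exact code of the answer.

```python
import pandas as pd

# Hypothetical sample data (values invented for illustration only).
mydf = pd.DataFrame({
    "Dt": pd.to_datetime(
        ["2020-10-01", "2020-11-01", "2021-03-01", "2021-04-01", "2021-06-01"]
    ),
    "Id": ["A", "A", "B", "B", "B"],
    "Sales": [47, 67, 2, 42, 20],
})

# Steps 1-2: first date per group, and the overall last date of the frame.
starts = mydf.groupby("Id")["Dt"].min()
end = mydf["Dt"].max()

# Steps 3-4: one monthly range per group, stacked into a single frame.
# (pd.concat is used here in place of apply + Series.explode.)
new_index = pd.concat(
    [
        pd.DataFrame({"Dt": pd.date_range(start, end, freq="MS"), "Id": group_id})
        for group_id, start in starts.items()
    ],
    ignore_index=True,
)

# Step 5: turn that frame into a MultiIndex.
midx = pd.MultiIndex.from_frame(new_index)

# Step 6: reindex against it, filling the newly created rows with 0.
result = (
    mydf.set_index(["Dt", "Id"])
        .reindex(midx, fill_value=0)
        .reset_index()
)
print(result)
```

Note that reindex keeps only the (Dt, Id) pairs present in midx, so a date that does not fall on the chosen frequency (for example a mid-month date with freq="MS") would be dropped from the result.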

mydf:

JavaScript

DataFrame:

JavaScript
User contributions licensed under: CC BY-SA