I have a list of dates to be used in a time series. The list looks as follows:
x = [None, None, '2019-07-01', None, '2019-09-01', None]
I want to fill in the missing dates with the values they should have had. For example, in the list x, x[2] = '2019-07-01' represents July, and the two elements before it are None. Those two elements should be replaced so that x[0] = '2019-05-01' and x[1] = '2019-06-01'. The same rule applies to the later elements. The final, updated list should look as follows:
x = ['2019-05-01', '2019-06-01', '2019-07-01', '2019-08-01', '2019-09-01', '2019-10-01']
Answer
The basic idea is to take the first month present in the list, together with its index, and compute every other position relative to that “fixed” month.
Using relativedelta() from Python’s dateutil package, the current index in the list, and the index of the fixed month, you can add or subtract months from the fixed month to get the appropriate month for each slot in the list.
Unlike the other answer, this makes no assumption about how many “empty slots” come before the first month in the list.
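The offset arithmetic can be looked at in isolation before reading the full function. This is only a minimal sketch: the values of fixed_i and fixed_m are hard-coded here to match the example list from the question, and the list length of 6 is likewise an assumption taken from that example.

```python
from datetime import date
from dateutil.relativedelta import relativedelta

fixed_i = 2                  # index of the first known month (assumed, from the example list)
fixed_m = date(2019, 7, 1)   # the date stored at that index

# The offset i - fixed_i is negative for slots before the fixed month
# and positive for slots after it, so one expression covers both directions.
filled = [str(fixed_m + relativedelta(months=i - fixed_i)) for i in range(6)]
print(filled)
```

Because the offset is signed, a single relativedelta(months=i - fixed_i) is enough for slots on either side of the fixed month.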
import datetime
from typing import List, Optional

from dateutil.relativedelta import relativedelta


def fill_months(dates: List[Optional[str]]) -> List[str]:
    # Get the first month in the list. We will use this to compare all
    # the other months in the list relative to this date / index pos.
    fixed_i, fixed_m = [
        (i, datetime.datetime.strptime(d, "%Y-%m-%d").date())
        for i, d in enumerate(dates)
        if d
    ][0]

    # Loop through all items in the list. If any are set to None,
    # calculate the month using relativedelta() and update the list.
    # Note that the input list is modified in place.
    for i in range(len(dates)):
        if dates[i] is not None:
            continue
        # i - fixed_i is negative before the fixed month and positive after it.
        month = fixed_m + relativedelta(months=i - fixed_i)
        dates[i] = str(month)
    return dates


dates = [None, None, '2019-07-01', None, '2019-09-01', None]
print(fill_months(dates))
Output:
['2019-05-01', '2019-06-01', '2019-07-01', '2019-08-01', '2019-09-01', '2019-10-01']
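If installing dateutil is not an option, the same month shift can be approximated with the standard library alone. The helper below is a hypothetical name (shift_month is not part of the answer above or of any library), and it assumes every date falls on the first of the month, as in the example list; for other days, date.replace() could raise ValueError when the target month is shorter.

```python
from datetime import date


def shift_month(d: date, offset: int) -> date:
    """Hypothetical helper: move d by `offset` months (may be negative)."""
    # Count months from year 0 so divmod handles year rollover in both directions.
    total = d.year * 12 + (d.month - 1) + offset
    year, month0 = divmod(total, 12)
    # Assumes d.day exists in the target month (true for day 1).
    return d.replace(year=year, month=month0 + 1)


print(shift_month(date(2019, 7, 1), -2))
print(shift_month(date(2019, 11, 1), 3))
```

In fill_months(), fixed_m + relativedelta(months=i - fixed_i) would then become shift_month(fixed_m, i - fixed_i).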