Skip to content
Advertisement

Pandas dataframe – fillna with last of next month

I’ve been staring at this way too long and I think Ive lost my mind, it really shouldn’t be as complicated as I’m making it.

I have a df:

Date1 Date2
2022-04-01 2022-06-17
2022-04-15 2022-04-15
2022-03-03 NaT
2022-04-22 NaT
2022-05-06 2022-06-06

I want to fill the blanks in ‘Date2’ where it keeps the values from ‘Date2’ if they are present but if ‘Date2’ is NaT then I want it to be the last date of the subsequent month from ‘Date1’.

In the example above, the 2 NaT fields would become:

Date1 Date2
2022-03-03 2022-04-30
2022-04-22 2022-05-31

I know I have to use .fillna and the closest I’ve come is this:

df['Date2'] = (df['Date2'].fillna((df['Date1'] + pd.DateOffset(months=1)).replace)).to_numpy().astype('datetime64[M]')

This returns the first of the month. However, it returns the first of the month for all rows (not just NaT rows) and it is returning the first of the month as opposed to the last of the month.

I’m pretty sure my parenthesis are messed up and I’ve tried many different combinations of – timedelta and similar.

What am I doing wrong here? TIA!

Advertisement

Answer

Your question can be interpreted in two ways given the provided example.

End of month of the next row’s date 1 (which now does not seem to be what you want)

You need to use pd.offses.MonthEnd and shift

df['Date2'] = (df['Date2']
               .fillna(df['Date1'].add(pd.offsets.MonthEnd())
                                  .shift(-1))
               )

Next month’s end (same row)

If you want the next month end of the same row:

df['Date2'] = (df['Date2']
               .fillna(df['Date1'].add(pd.offsets.MonthEnd(2)))
               )

Output:

       Date1      Date2
0 2022-04-01 2022-06-17
1 2022-04-15 2022-04-15
2 2022-03-03 2022-04-30
3 2022-04-22 2022-05-31
4 2022-05-06 2022-06-06
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement