
Replacing day in date with last day of month

I’ve got a dataframe with a column ‘date’ containing values of the form yyyy-mm-dd 00:00:00. Python automatically assigned the first day of each quarterly month, but I want the last day of the same month.

I’ve tried:

cr_ind_ml_new = cr_ind_ml.replace(to_replace= '07-01', value='07-31', regex=True)

and

cr_ind_ml.replace('07-01', '07-31', inplace= True)

without any success.

sample of df:

date value
1990-07-01 00:00:00 46.7
1990-10-01 00:00:00 54.2
1991-01-01 00:00:00 38.6
1991-04-01 00:00:00 20
1991-07-01 00:00:00 18.6

I want:

date value
1990-07-31 00:00:00 46.7
1990-10-31 00:00:00 54.2
1991-01-31 00:00:00 38.6
1991-04-30 00:00:00 20
1991-07-31 00:00:00 18.6

Does anyone have thoughts?


Answer

If your df

                  date  value
0  1990-07-01 00:00:00   46.7
1  1990-10-01 00:00:00   54.2
2  1991-01-01 00:00:00   38.6
3  1991-04-01 00:00:00   20.0
4  1991-07-01 00:00:00   18.6

has a column date containing strings, then you can use pd.to_datetime() together with the MonthEnd() offset:

df['date'] = pd.to_datetime(df['date']) + pd.offsets.MonthEnd()

which gives you

        date  value
0 1990-07-31   46.7
1 1990-10-31   54.2
2 1991-01-31   38.6
3 1991-04-30   20.0
4 1991-07-31   18.6

Note that the elements of date are now of type datetime. If the items in date are already of type datetime, then you don’t need the pd.to_datetime() part; just adding the offset is enough.
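For the case where date is already datetime, a minimal sketch could look like the following. The frame is rebuilt from the sample in the question, and the names month_end, already_end, next_end and same_end are only illustrative. It also shows a caveat that may matter for other data: MonthEnd() with its default n=1 moves a date that is already a month end to the next month end, while MonthEnd(0) leaves such a date where it is.

```python
import pandas as pd

# Sample frame from the question, with 'date' already parsed to datetime.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["1990-07-01", "1990-10-01", "1991-01-01", "1991-04-01", "1991-07-01"]
        ),
        "value": [46.7, 54.2, 38.6, 20.0, 18.6],
    }
)

# The column is already datetime, so adding the offset is enough.
month_end = df["date"] + pd.offsets.MonthEnd()

# Caveat: with the default n=1, a date that is already a month end
# rolls forward to the NEXT month end; MonthEnd(0) keeps it unchanged.
already_end = pd.Timestamp("1990-07-31")
next_end = already_end + pd.offsets.MonthEnd()
same_end = already_end + pd.offsets.MonthEnd(0)
```

Since every date in the question falls on the first of a month, both variants give the same result there; MonthEnd(0) is probably the safer choice if some dates might already be month ends.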

If you want the expected output as strings, then this (starting from the original)

df['date'] = (
    pd.to_datetime(df['date']) + pd.offsets.MonthEnd()
).dt.strftime('%Y-%m-%d %H:%M:%S')

gives you

                  date  value
0  1990-07-31 00:00:00   46.7
1  1990-10-31 00:00:00   54.2
2  1991-01-31 00:00:00   38.6
3  1991-04-30 00:00:00   20.0
4  1991-07-31 00:00:00   18.6