I’ve got a dataframe with a column ‘date’ containing values of the form yyyy-mm-dd 00:00:00. Python automatically assigned the first day of the month to each quarterly date, but I want the last day of the same month.
I’ve tried:
cr_ind_ml_new = cr_ind_ml.replace(to_replace= '07-01', value='07-31', regex=True)
and
cr_ind_ml.replace('07-01', '07-31', inplace= True)
without any success.
sample of df:
| date | value |
|---|---|
| 1990-07-01 00:00:00 | 46.7 |
| 1990-10-01 00:00:00 | 54.2 |
| 1991-01-01 00:00:00 | 38.6 |
| 1991-04-01 00:00:00 | 20 |
| 1991-07-01 00:00:00 | 18.6 |
I want:
| date | value |
|---|---|
| 1990-07-31 00:00:00 | 46.7 |
| 1990-10-31 00:00:00 | 54.2 |
| 1991-01-31 00:00:00 | 38.6 |
| 1991-04-30 00:00:00 | 20 |
| 1991-07-31 00:00:00 | 18.6 |
Does anyone have thoughts?
Answer
If your df
| | date | value |
|---|---|---|
| 0 | 1990-07-01 00:00:00 | 46.7 |
| 1 | 1990-10-01 00:00:00 | 54.2 |
| 2 | 1991-01-01 00:00:00 | 38.6 |
| 3 | 1991-04-01 00:00:00 | 20.0 |
| 4 | 1991-07-01 00:00:00 | 18.6 |
has a column date with strings, then you can use pd.to_datetime() and the MonthEnd() offset to do
df['date'] = pd.to_datetime(df['date']) + pd.offsets.MonthEnd()
which gives you
| | date | value |
|---|---|---|
| 0 | 1990-07-31 | 46.7 |
| 1 | 1990-10-31 | 54.2 |
| 2 | 1991-01-31 | 38.6 |
| 3 | 1991-04-30 | 20.0 |
| 4 | 1991-07-31 | 18.6 |
Note that the elements of date are now of type datetime. If the items in date are already of type datetime, then you don’t need the pd.to_datetime() part; just adding the offset is enough.
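For reference, here is a minimal, self-contained sketch of the approach above. It rebuilds the sample frame from the question by hand (the construction of df is my assumption about how the data looks, with date stored as strings) and then checks the resulting dtype:

```python
import pandas as pd

# Sample data copied from the question; storing 'date' as strings is an assumption.
df = pd.DataFrame({
    "date": [
        "1990-07-01 00:00:00",
        "1990-10-01 00:00:00",
        "1991-01-01 00:00:00",
        "1991-04-01 00:00:00",
        "1991-07-01 00:00:00",
    ],
    "value": [46.7, 54.2, 38.6, 20.0, 18.6],
})

# Parse the strings, then roll each first-of-month date forward to its month end.
df["date"] = pd.to_datetime(df["date"]) + pd.offsets.MonthEnd()

# The column now holds datetimes, not strings.
print(df["date"].dtype)
print(df)
```

If the column is already datetime, the pd.to_datetime() call is harmless, so the same line should work in both cases.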
If you want the expected output as strings, then this (starting from the original)
df['date'] = (
    pd.to_datetime(df['date']) + pd.offsets.MonthEnd()
).dt.strftime('%Y-%m-%d %H:%M:%S')
gives you
| | date | value |
|---|---|---|
| 0 | 1990-07-31 00:00:00 | 46.7 |
| 1 | 1990-10-31 00:00:00 | 54.2 |
| 2 | 1991-01-31 00:00:00 | 38.6 |
| 3 | 1991-04-30 00:00:00 | 20.0 |
| 4 | 1991-07-31 00:00:00 | 18.6 |
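One caveat, in case your real data is not always on the first of the month: MonthEnd() defaults to n=1, which moves a date that is already a month end to the end of the next month. MonthEnd(0) instead leaves such dates where they are. A small sketch with two made-up dates (not from the question) to show the difference:

```python
import pandas as pd

# Two illustrative dates: a first-of-month and a date already at month end.
dates = pd.Series(pd.to_datetime(["1990-07-01", "1990-07-31"]))

# n=1 (the default): a month-end date jumps to the following month end.
plus_one = dates + pd.offsets.MonthEnd()

# n=0: rolls forward to the month end only if the date is not already one.
plus_zero = dates + pd.offsets.MonthEnd(0)

print(plus_one.dt.strftime("%Y-%m-%d").tolist())   # ['1990-07-31', '1990-08-31']
print(plus_zero.dt.strftime("%Y-%m-%d").tolist())  # ['1990-07-31', '1990-07-31']
```

For the data in the question both variants give the same result, since every date is the first of a month.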