I’ve got a DataFrame with a column ‘date’ holding values in the form yyyy-mm-dd 00:00:00. Each date was automatically set to the first day of the first month of its quarter, but I want the last day of that same month.
I’ve tried:
cr_ind_ml_new = cr_ind_ml.replace(to_replace= '07-01', value='07-31', regex=True)
and
cr_ind_ml.replace('07-01', '07-31', inplace= True)
without any success.
sample of df:
date | value |
---|---|
1990-07-01 00:00:00 | 46.7 |
1990-10-01 00:00:00 | 54.2 |
1991-01-01 00:00:00 | 38.6 |
1991-04-01 00:00:00 | 20 |
1991-07-01 00:00:00 | 18.6 |
I want:
date | value |
---|---|
1990-07-31 00:00:00 | 46.7 |
1990-10-31 00:00:00 | 54.2 |
1991-01-31 00:00:00 | 38.6 |
1991-04-30 00:00:00 | 20 |
1991-07-31 00:00:00 | 18.6 |
Does anyone have thoughts?
Answer
If your df

 | date | value |
---|---|---|
0 | 1990-07-01 00:00:00 | 46.7 |
1 | 1990-10-01 00:00:00 | 54.2 |
2 | 1991-01-01 00:00:00 | 38.6 |
3 | 1991-04-01 00:00:00 | 20.0 |
4 | 1991-07-01 00:00:00 | 18.6 |

has a column date with strings, then you can use pd.to_datetime() and the MonthEnd() offset to do
df['date'] = pd.to_datetime(df['date']) + pd.offsets.MonthEnd()
which gives you
 | date | value |
---|---|---|
0 | 1990-07-31 | 46.7 |
1 | 1990-10-31 | 54.2 |
2 | 1991-01-31 | 38.6 |
3 | 1991-04-30 | 20.0 |
4 | 1991-07-31 | 18.6 |
But note: the elements of date are now of type datetime. If the items in date are already of type datetime, then you don’t need the pd.to_datetime() part; just adding the offset is enough.
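A minimal sketch of the already-datetime case, rebuilding the sample frame from the question. The names month_end, already_end, shifted and kept are only illustrative. It also uses MonthEnd(0), a variation on the call above: as far as I know, the default MonthEnd() (n=1) moves a date that already is a month end on to the next month's end, while MonthEnd(0) leaves it where it is, so the latter is safer to run more than once.

```python
import pandas as pd

# Sample data from the question, with 'date' already parsed to datetime64.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "1990-07-01", "1990-10-01", "1991-01-01", "1991-04-01", "1991-07-01",
    ]),
    "value": [46.7, 54.2, 38.6, 20.0, 18.6],
})

# No pd.to_datetime() needed: the offset is added to the column directly.
# MonthEnd(0) rolls each date forward to the end of its own month.
month_end = df["date"] + pd.offsets.MonthEnd(0)

# Difference between n=1 (default) and n=0 on a date that is already a month end.
already_end = pd.Timestamp("1990-07-31")
shifted = already_end + pd.offsets.MonthEnd()   # moves on to the next month end
kept = already_end + pd.offsets.MonthEnd(0)     # stays on the same day

print(month_end)
print(shifted, kept)
```

For the first-of-month dates in the question both variants should give the same result; the difference only matters once some dates already sit on a month end.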
If you want the expected output as strings, then this (starting from the original)
df['date'] = ( pd.to_datetime(df['date']) + pd.offsets.MonthEnd() ).dt.strftime('%Y-%m-%d %H:%M:%S')
gives you
 | date | value |
---|---|---|
0 | 1990-07-31 00:00:00 | 46.7 |
1 | 1990-10-31 00:00:00 | 54.2 |
2 | 1991-01-31 00:00:00 | 38.6 |
3 | 1991-04-30 00:00:00 | 20.0 |
4 | 1991-07-31 00:00:00 | 18.6 |