I have a pandas dataframe column named disbursal_date which is a datetime:
disbursal_date
2009-01-28
2008-01-03
2008-07-15
and so on…
I want to keep the day and month and replace the year with 2022 for all values.
I tried using df['disbursal_date'].map(lambda x: x.replace(year=2022)) but this didn’t work for me.
Answer
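- Series.map and Series.apply both accept a Python function and call it on each element of the column, so either one can be used here; the sample below uses apply.
- The column's dtype must be a pandas datetime (datetime64[ns]), not object or string. If the values are plain strings, x.replace(year=2022) ends up calling str.replace, which does not accept a year argument and fails with a TypeError.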
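As a rough illustration of the two points above, the following sketch checks the dtype before and after conversion and then runs the same lambda through both map and apply. The sample frame is made up for the example, and pd.to_datetime is used for the conversion as one common option, not necessarily what the asker's data requires.

```python
import pandas as pd

# Hypothetical frame where the dates arrived as plain strings
# (object or string dtype, depending on the pandas version).
df = pd.DataFrame({"disbursal_date": ["2009-01-28", "2008-01-03", "2008-07-15"]})
is_datetime_before = pd.api.types.is_datetime64_any_dtype(df["disbursal_date"])

# Convert the strings to pandas datetimes.
df["disbursal_date"] = pd.to_datetime(df["disbursal_date"])
is_datetime_after = pd.api.types.is_datetime64_any_dtype(df["disbursal_date"])

# Each element is now a pd.Timestamp, so .replace(year=...) is valid
# whether the function is passed to map or to apply.
via_map = df["disbursal_date"].map(lambda x: x.replace(year=2022))
via_apply = df["disbursal_date"].apply(lambda x: x.replace(year=2022))

print(is_datetime_before, is_datetime_after)
print(via_map)
```

If the dtype check already reports a datetime column and the call still fails, the cause is probably in the data itself (for example a date that does not exist in the target year) and not in the choice between map and apply.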
Below is sample code I tried; it works and replaces the year with 2022.
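import pandas as pd

df = pd.DataFrame(['2009-01-28', '2008-01-03', '2008-07-15'], columns=['disbursal_old'])
# Convert the strings to pandas datetimes so each element is a Timestamp
df['disbursal_old'] = df['disbursal_old'].astype('datetime64[ns]')
# Timestamp.replace(year=2022) keeps the month and day and swaps the year
df['disbursal_new'] = df['disbursal_old'].apply(lambda x: x.replace(year=2022))
print(df['disbursal_new'])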
0   2022-01-28
1   2022-01-03
2   2022-07-15
Name: disbursal_new, dtype: datetime64[ns]
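One caveat with replace(year=2022): 2022 is not a leap year, so a value of 29 February raises a ValueError. The sketch below shows that failure and one possible way around it, rebuilding the dates from year, month and day columns with pd.to_datetime and errors="coerce" so that impossible dates become NaT. The sample dates (including the 29 February row) are invented for the example, and turning such rows into NaT is an assumption about what you want; mapping them to 28 February would be an equally reasonable choice.

```python
import pandas as pd

# Hypothetical data that includes a leap day.
old = pd.Series(
    pd.to_datetime(["2009-01-28", "2008-02-29", "2008-07-15"]),
    name="disbursal_old",
)

# Element-wise replace fails on the leap day because 2022-02-29 does not exist.
try:
    pd.Timestamp("2008-02-29").replace(year=2022)
    leap_day_ok = True
except ValueError:
    leap_day_ok = False

# Vectorized alternative: assemble new dates from their components.
# pd.to_datetime accepts a DataFrame with year/month/day columns;
# errors="coerce" turns impossible dates into NaT instead of raising.
parts = pd.DataFrame({"year": 2022, "month": old.dt.month, "day": old.dt.day})
new = pd.to_datetime(parts, errors="coerce")

print(leap_day_ok)
print(new)
```

This version also avoids calling a Python function per row, which may matter on large columns.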
The code below computes the difference in years between the new and old dates.
df['disbursal_diff_year'] = df['disbursal_new'].dt.year - df['disbursal_old'].dt.year
print(df)
  disbursal_old disbursal_new  disbursal_diff_year
0    2009-01-28    2022-01-28                   13
1    2008-01-03    2022-01-03                   14
2    2008-07-15    2022-07-15                   14