How can I calculate the number of elapsed months using pandas? I have written the following, but the code is not elegant. Is there a better way?
import pandas as pd

df = pd.DataFrame([pd.Timestamp('20161011'),
                   pd.Timestamp('20161101')], columns=['date'])
df['today'] = pd.Timestamp('20161202')

df = df.assign(
    elapsed_months=(12 *
                    (df["today"].map(lambda x: x.year) -
                     df["date"].map(lambda x: x.year)) +
                    (df["today"].map(lambda x: x.month) -
                     df["date"].map(lambda x: x.month))))
# Out[34]:
#         date      today  elapsed_months
# 0 2016-10-11 2016-12-02               2
# 1 2016-11-01 2016-12-02               1
Answer
Update for pandas 0.24.0:
Since pandas 0.24.0, subtracting one period from another returns a MonthEnd offset object instead of an integer, so you can compute the whole-month difference manually:
12 * (df.today.dt.year - df.date.dt.year) + (df.today.dt.month - df.date.dt.month)

# 0    2
# 1    1
# dtype: int64
Wrap in a function:
def month_diff(a, b):
    return 12 * (a.dt.year - b.dt.year) + (a.dt.month - b.dt.month)

month_diff(df.today, df.date)
# 0    2
# 1    1
# dtype: int64
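As a quick check of the arithmetic, the sketch below applies the same formula to dates that straddle a year boundary. The sample dates and the names `earlier` and `later` are made up for illustration and do not come from the question.

```python
import pandas as pd

def month_diff(a, b):
    # Calendar-month difference between two datetime Series (days are ignored).
    return 12 * (a.dt.year - b.dt.year) + (a.dt.month - b.dt.month)

# Hypothetical sample data, not from the question.
earlier = pd.Series(pd.to_datetime(["2016-11-30", "2016-12-31"]))
later = pd.Series(pd.to_datetime(["2017-02-15", "2017-01-01"]))

result = month_diff(later, earlier)
print(result.tolist())  # [3, 1]
```

Note that the second pair is only one day apart yet counts as 1, because the formula compares calendar months and ignores the day of the month.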
Prior to pandas 0.24.0, you can truncate the dates to months with to_period('M') and then subtract the results:
df['elapsed_months'] = df.today.dt.to_period('M') - df.date.dt.to_period('M')

df
#         date      today  elapsed_months
# 0 2016-10-11 2016-12-02               2
# 1 2016-11-01 2016-12-02               1
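On pandas 0.24.0 or later, the same period subtraction still runs but produces offset objects rather than integers. A minimal sketch of one way to recover the integers, assuming each offset exposes its multiple through the `n` attribute (the variable name `offsets` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"date": [pd.Timestamp("20161011"), pd.Timestamp("20161101")]})
df["today"] = pd.Timestamp("20161202")

# Assumes pandas >= 0.24: the subtraction yields offsets such as <2 * MonthEnds>.
offsets = df["today"].dt.to_period("M") - df["date"].dt.to_period("M")

# Pull the integer multiple out of each offset object.
df["elapsed_months"] = offsets.apply(lambda off: off.n)
print(df["elapsed_months"].tolist())  # [2, 1]
```

This keeps the to_period approach, at the cost of an element-wise apply; the year and month arithmetic shown earlier stays vectorized.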