How to get grouped cumulative duration in pandas?

Question

I have the following data:

id	encounter_key	datetime
1	111	2019-04-14
1	111	2019-04-14
1	111	2019-07-18
1	122	2019-09-02
2	211	2019-10-03
2	211	2020-10-03

I want to find the cumulative duration, grouped by id and encounter_key, to achieve the following:

id	encounter_key	datetime	cum_duration_days
1	111	2019-04-14	0
1	111	2019-04-14	0
1	111	2019-07-18	95
1	122

Accepted Answer

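I think this should work:

    import numpy as np

    # Gap in days from the previous row within each (id, encounter_key) group
    df['cum_duration_days'] = df.groupby(['id', 'encounter_key'])['datetime'].diff() / np.timedelta64(1, 'D')
    # fillna returns a new Series, so assign the result back
    df['cum_duration_days'] = df['cum_duration_days'].fillna(0)

But @enke is right: the desired output seems to have an error on the '31', unless you are not showing all the rows for encounter_key=211 / id=2 …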
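One thing worth keeping in mind: diff() measures the gap from the previous row in the group, not the time elapsed since the group started. The two agree on the sample data, but they would differ for a group with three or more distinct dates. As a minimal sketch of the running-total reading of "cumulative", assuming the datetime column is already a datetime64 dtype (the name first_seen is only illustrative), you can subtract each group's earliest timestamp:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "encounter_key": [111, 111, 111, 122, 211, 211],
    "datetime": pd.to_datetime([
        "2019-04-14", "2019-04-14", "2019-07-18",
        "2019-09-02", "2019-10-03", "2020-10-03",
    ]),
})

# Earliest datetime of each (id, encounter_key) group, aligned to the original rows
first_seen = df.groupby(["id", "encounter_key"])["datetime"].transform("min")

# Days elapsed since the start of the group (whole days as integers)
df["cum_duration_days"] = (df["datetime"] - first_seen).dt.days

print(df)
```

Because transform("min") is used rather than relying on row order, this version does not need the rows to be sorted by date within each group.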

Answer