Creating time delta diff column based on groupby id

Question

I have the following sample df. I want to group by Id and get the timedelta difference between the timestamps. I managed to get something similar to the wanted series through this code, but it is taking quite a long time. Is there a way to do it more efficiently?

Accepted Answer

Here is one way to go about it.

By the way, if you group by ID, then the desired result you shared is incorrect. The third row should have no difference (it shows as NaT below), since it is the first row of a different ID.

    import pandas as pd

    # convert the TimeStamp column to datetime
    df['TimeStamp'] = pd.to_datetime(df['TimeStamp'])

    # create post_data via vectorization instead of a lambda, it'll be faster
    df['post_data'] = df.groupby('ID')['TimeStamp'].shift(1)

    # finally, take the difference
    df['diff'] = df['TimeStamp'].sub(df['post_data'])
    df

Output:

      ID                         TimeStamp                         post_data                    diff
    0  A  2022-08-02 17:33:44.358000+00:00                               NaT                     NaT
    1  A  2022-08-02 17:33:44.600000+00:00  2022-08-02 17:33:44.358000+00:00  0 days 00:00:00.242000
    2  B  2022-08-02 17:33:44.814000+00:00                               NaT                     NaT
    3  B  2022-08-02 17:33:45.028000+00:00  2022-08-02 17:33:44.814000+00:00  0 days 00:00:00.214000
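For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input frame is an assumption: it is rebuilt from the four rows shown in the answer's output, since the original sample df is not included here. It also shows `groupby(...).diff()` as a possible one-step alternative to shift-then-subtract. That alternative is not part of the answer above, and the `diff_seconds` column name is made up for illustration.

```python
import pandas as pd

# Assumed input, rebuilt from the rows shown in the answer's output
df = pd.DataFrame({
    "ID": ["A", "A", "B", "B"],
    "TimeStamp": [
        "2022-08-02 17:33:44.358000+00:00",
        "2022-08-02 17:33:44.600000+00:00",
        "2022-08-02 17:33:44.814000+00:00",
        "2022-08-02 17:33:45.028000+00:00",
    ],
})
df["TimeStamp"] = pd.to_datetime(df["TimeStamp"])

# Two-step version from the answer: shift within each ID, then subtract
via_shift = df["TimeStamp"].sub(df.groupby("ID")["TimeStamp"].shift(1))

# One-step alternative: per-group difference to the previous row
df["diff"] = df.groupby("ID")["TimeStamp"].diff()

# Optional: the same differences as float seconds (hypothetical column name)
df["diff_seconds"] = df["diff"].dt.total_seconds()

print(df)
```

Both versions leave the first row of each ID as NaT, because there is no earlier timestamp in that group to subtract. They also assume the rows are already in time order within each ID. If they are not, sorting by ID and TimeStamp first is probably what you want.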

Answer