Count occurrences in last 30 days with Pandas Dataframe

Question

I have a pandas DataFrame with an ID column and a date column (YYYY-MM-DD):

    ID   Date
    001  2022-01-01
    001  2022-01-04
    001  2022-02-07
    002  2022-01-02
    002  2022-01-03
    002  2022-01-28

There may be gaps in the date field, as shown. I would like to have a new column, "occurrences_last_month" where it counts the number of occurrences for each ID in the last

Accepted Answer

The first idea is to use Rolling.count per group and then remove the first index level, which is created by grouping on ID. The rolling count includes the current row, so 1 is subtracted. A '30D' window needs a datetime index, so Date should already be a datetime column (for example, converted with pd.to_datetime):

    df = df.set_index('Date')
    df['Ocurrences_last_month'] = (df.groupby('ID')
                                     .rolling('30D')
                                     .count().sub(1).droplevel(0).astype(int))
    print (df)
                ID  Ocurrences_last_month
    Date
    2022-01-01   1                      0
    2022-01-04   1                      1
    2022-02-07   1                      0
    2022-01-02   2                      0
    2022-01-03   2                      1
    2022-01-28   2                      2

EDIT: If duplicated values are possible, create a Series and assign it to the original DataFrame with DataFrame.join:

    s = df.groupby('ID').rolling('30D', on='Date')['Date'].count().sub(1).astype(int)
    df = df.join(s.rename('Ocurrences_last_month'), on=['ID','Date'])
    print (df)
       ID       Date  Ocurrences_last_month
    0   1 2022-01-01                      0
    1   1 2022-01-04                      1
    2   1 2022-02-07                      0
    3   2 2022-01-02                      0
    4   2 2022-01-03                      1
    5   2 2022-01-28                      2

Alternative solution from the comments:

    df = df.merge(s.rename('Ocurrences_last_month'), on=['ID','Date'])
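For anyone who wants to run the idea end to end, here is a minimal self-contained sketch. It is not the code from the answer: it applies the same 30-day time-based rolling window, but loops over the ID groups explicitly instead of using groupby(...).rolling(...). The helper name add_occurrences_last_30d is hypothetical, the sample data is copied from the question, and the sketch assumes the DataFrame has a unique row index.

```python
import pandas as pd


def add_occurrences_last_30d(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: count earlier rows per ID within the last 30 days.

    Assumes columns 'ID' and 'Date' and a unique row index.
    """
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])

    parts = []
    for _, grp in out.groupby("ID"):
        # Time-based rolling windows need the dates in increasing order.
        grp = grp.sort_values("Date")
        # One marker per row, indexed by date, so a '30D' window can be applied.
        ones = pd.Series(1, index=pd.DatetimeIndex(grp["Date"]))
        # The window sum includes the current row, so subtract 1.
        counts = ones.rolling("30D").sum().sub(1).astype(int)
        # Re-attach the original row labels so the result aligns with df.
        parts.append(pd.Series(counts.to_numpy(), index=grp.index))

    out["occurrences_last_month"] = pd.concat(parts)
    return out


# Sample data from the question.
df = pd.DataFrame({
    "ID": ["001", "001", "001", "002", "002", "002"],
    "Date": ["2022-01-01", "2022-01-04", "2022-02-07",
             "2022-01-02", "2022-01-03", "2022-01-28"],
})
result = add_occurrences_last_30d(df)
print(result)
```

Because the counts are attached back through the original row labels rather than through the dates, this variant needs no join or merge step afterwards.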

Answer