How to use pandas to create a column that stores count of first occurrences on a group-by?

Question

Q1. Given data frame 1, I am trying to count, per month, the IDs that occur for the first time (a group-by on ID), plus another column that gives the count of already existing IDs per month.

Expected output: the number of newly added unique IDs per month, and the total number of IDs that already existed before that month.

Note: Mar-2020 ID_Count is ZERO because IDs 1, 2, and 3 were present in previous months.

Note: Existing count

Accepted Answer

I think you can do it like this:

    import pandas as pd

    df['month'] = pd.to_datetime(df['Date'], format='%b-%Y')

    # Find new IDs
    df['new'] = df.groupby('ID').cumcount() == 0

    # Count new IDs by month
    df_ct = df.groupby('month')['new'].sum().to_frame(name='ID_Count')

    # Count all previous new IDs
    df_ct['Existing_cnt'] = df_ct['ID_Count'].shift().cumsum().fillna(0).astype(int)

    df_ct.index = df_ct.index.strftime('%b-%Y')
    df_ct

Output:

              ID_Count  Existing_cnt
    month
    Jan-2020         1             0
    Feb-2020         2             1
    Mar-2020         0             3
    Apr-2020         2             3
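For anyone who wants to run the approach end to end, here is a minimal sketch. The input data frame is not shown in the question, so the sample rows below are hypothetical, chosen only so that they are consistent with the output in the answer. The explicit sort is also an addition: `cumcount()` numbers rows in their existing order, so the sketch assumes the rows need to be in chronological order for the first occurrence of each ID to be flagged correctly.

```python
import pandas as pd

# Hypothetical sample data (assumption: the original frame has 'Date' and 'ID' columns).
df = pd.DataFrame({
    "Date": ["Jan-2020",
             "Feb-2020", "Feb-2020", "Feb-2020",
             "Mar-2020", "Mar-2020", "Mar-2020",
             "Apr-2020", "Apr-2020"],
    "ID": [1, 1, 2, 3, 1, 2, 3, 4, 5],
})

# Parse the month labels so they sort chronologically rather than alphabetically.
df["month"] = pd.to_datetime(df["Date"], format="%b-%Y")

# cumcount() follows row order, so put the rows in chronological order first.
df = df.sort_values("month", kind="stable")

# True only on the first row seen for each ID.
df["new"] = df.groupby("ID").cumcount() == 0

# Number of first-time IDs per month.
df_ct = df.groupby("month")["new"].sum().to_frame(name="ID_Count")

# IDs first seen in any earlier month: shift by one month, then accumulate.
df_ct["Existing_cnt"] = df_ct["ID_Count"].shift().cumsum().fillna(0).astype(int)

# Restore the original 'Mon-YYYY' labels for display.
df_ct.index = df_ct.index.strftime("%b-%Y")
print(df_ct)
```

Note that a month with no rows at all will not appear in the result, because the group-by only produces months that exist in the data.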


Answer