I have the pandas dataframe,
data = pd.DataFrame([['1001','2020-03-06'],
                     ['1001','2020-04-06'],
                     ['1002','2021-04-02'],
                     ['1003','2022-07-08'],
                     ['1001','2020-09-06'],
                     ['1003','2022-04-04'],
                     ['1002','2021-06-05'],
                     ['1007','2020-09-08'],
                     ['1002','2021-12-07'],
                     ['1003','2022-12-06'],
                     ['1007','2020-02-10'],
                     ], 
                    columns=['Type', 'Date'])
I need to Group by each id and then apply the monthly difference in each id to get monthly frequency number
I tried out : Solution :
data['Date'] = pd.to_datetime(data['Date'])
data['diff'] = data.groupby(['Type'])['Date'].apply(lambda x:(x.max() - x)/np.timedelta64(1, 'M'))
data['diff'] = data['diff'].astype(int) 
data = data.sort_values('Type')
I am expecting the Output dataframe :
Dataframe :         ([['1007','2020-09-08', 0],
                     ['1007','2020-02-10', 7],
                     ['1003','2022-12-06', 0],
                     ['1003','2022-07-08', 5],
                     ['1003','2022-04-04', 3],
                     ['1002','2021-12-07', 0],
                     ['1002','2021-06-05', 6],
                     ['1002','2021-04-02', 2],
                     ['1001','2020-09-06', 0],
                     ['1001','2020-04-06', 5],
                     ['1001','2020-03-06', 1],
                     ], 
                    columns=['Type', 'Date', 'MonthlyFreq'])
Advertisement
Answer
You can use period objects to calculate the number of monthly periods in between 2 dates:
data['Date'] = pd.to_datetime(data['Date'])
data = data.sort_values(['Type', 'Date'], ascending=False)
data['diff'] = (data['Date']
 .dt.to_period('M') # convert to monthly period
 .groupby(data['Type']).diff().mul(-1)
 .apply(lambda x: 0 if pd.isna(x) else x.n)      
)
output:
Type Date diff 7 1007 2020-09-08 0 10 1007 2020-02-10 7 9 1003 2022-12-06 0 3 1003 2022-07-08 5 5 1003 2022-04-04 3 8 1002 2021-12-07 0 6 1002 2021-06-05 6 2 1002 2021-04-02 2 4 1001 2020-09-06 0 1 1001 2020-04-06 5 0 1001 2020-03-06 1