I need to calculate the first purchase amount for every client. This is my code:
ticket.groupby([‘user_reference_id’,’total_amount’]).reference_date.min().reset_index()“
And i have this result:
user_reference_id total_amount reference_date*
159 12.77 2019-06-17 159 19.73 2019-06-17 159 21.55 2019-06-18 159 49.15 2019-06-16 159 49.66 2019-06-14
I need it grouped by user_reference_id with the minimum reference_date (first date when a customer made the purchase) and corresponding total_amount. In this case i need the next output: reference_date 2019-06-14, user_reference_id 159 and total_amount 49.66
Advertisement
Answer
one option would be:
step 1: sort by id and date
df=df.sort_values(['user_reference_id', 'reference_date'], ascending=True)
Step 2: as df is sorted, get the first value using groupby.agg function.
df_result=df.groupby('user_reference_id',as_index=False).agg({'reference_date':'first', 'total_amount':'first'})