I need to calculate the first purchase amount for every client. This is my code:
ticket.groupby(['user_reference_id', 'total_amount']).reference_date.min().reset_index()
And I get this result:
user_reference_id  total_amount  reference_date
159                12.77         2019-06-17
159                19.73         2019-06-17
159                21.55         2019-06-18
159                49.15         2019-06-16
159                49.66         2019-06-14
I need it grouped by user_reference_id only, with the minimum reference_date (the date of the customer's first purchase) and the total_amount from that same row. In this case I need the following output: reference_date 2019-06-14, user_reference_id 159 and total_amount 49.66.
Answer
One option would be:
Step 1: sort by ID and date.
df = df.sort_values(['user_reference_id', 'reference_date'], ascending=True)
Step 2: since df is now sorted, take the first row of each group with groupby(...).agg.
df_result = df.groupby('user_reference_id', as_index=False).agg({'reference_date': 'first', 'total_amount': 'first'})
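For reference, here is a minimal, self-contained sketch of both steps run on the sample rows from the question. It assumes reference_date is (or can be converted to) a datetime column. The DataFrame name ticket follows the question's code, while first_idx and df_alt are names made up here for illustration. The idxmin variant at the end is an alternative I am suggesting, not part of the steps above.

```python
import pandas as pd

# Sample rows copied from the question; `ticket` is the name used in the question's code.
ticket = pd.DataFrame({
    "user_reference_id": [159, 159, 159, 159, 159],
    "total_amount": [12.77, 19.73, 21.55, 49.15, 49.66],
    "reference_date": pd.to_datetime(
        ["2019-06-17", "2019-06-17", "2019-06-18", "2019-06-16", "2019-06-14"]
    ),
})

# Step 1: sort so that each user's earliest purchase comes first.
df = ticket.sort_values(["user_reference_id", "reference_date"], ascending=True)

# Step 2: keep the first date and its amount for every user.
df_result = df.groupby("user_reference_id", as_index=False).agg(
    {"reference_date": "first", "total_amount": "first"}
)
print(df_result)

# Alternative (assumption, not from the answer): find the row label of each
# user's earliest date with idxmin, then select those rows directly.
first_idx = ticket.groupby("user_reference_id")["reference_date"].idxmin()
df_alt = ticket.loc[first_idx].reset_index(drop=True)
print(df_alt)
```

On the sample data both variants should return a single row for user 159 with reference_date 2019-06-14 and total_amount 49.66. If a user has several purchases on the same earliest date, which of them is kept depends on row order, so add a tie-breaking column to the sort if that matters.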