Per user, I want a unique item order (in the order they click through the items). If an item has already been seen, it should not get a new cumulative count; it should reuse the value already assigned to it. See, for example, c, d, g and b in the tables below. I used the line below, but it's not getting the job done at the moment. If I add 'User_id' to the grouper, I mess up the ngroup(). Can anyone help me with this?
df['Order Number'] = df.groupby(pd.Grouper(key='Item', sort=False)).ngroup() + 1
print(df)
Current Output:
    User_id Item  Order Number
0         1    b             1
1         1    a             2
2         1    c             3
3         1    d             4
4         1    c             3
5         1    d             4
6         1    e             5
7         1    b             1
8         1    f             6
9         1    g             7
10        1    b             1
-----------------------------
11        2    x             8
12        2    g             7
13        2    g             7
14        2    f             6
15        2    h             9
16        2    i            10
17        2    f            11
18        2    k            12
19        2    l            13
Desired Output:
    User_id Item  Order Number
0         1    b             1
1         1    a             2
2         1    c             3
3         1    d             4
4         1    c             3
5         1    d             4
6         1    e             5
7         1    b             1
8         1    f             6
9         1    g             7
10        1    b             1
-----------------------------
11        2    x             1
12        2    g             2
13        2    g             2
14        2    f             3
15        2    h             4
16        2    i             5
17        2    f             3
18        2    k             7
19        2    l             8
Answer
Use GroupBy.transform with factorize in a lambda function:
df['Order Number'] = df.groupby('User_id')['Item'].transform(lambda x: pd.factorize(x)[0]) + 1
print(df)

    User_id Item  Order Number
0         1    b             1
1         1    a             2
2         1    c             3
3         1    d             4
4         1    c             3
5         1    d             4
6         1    e             5
7         1    b             1
8         1    f             6
9         1    g             7
10        1    b             1
11        2    x             1
12        2    g             2
13        2    g             2
14        2    f             3
15        2    h             4
16        2    i             5
17        2    f             3
18        2    k             6
19        2    l             7
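For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt from the sample data in the question; the helper name first_seen_rank is only an illustrative name I made up, not part of pandas. It relies on pd.factorize numbering values in order of first appearance, so a repeated item gets its earlier code back, and on groupby restarting that numbering for each user.

```python
import pandas as pd

# Sample data rebuilt from the question (20 rows, two users).
df = pd.DataFrame({
    "User_id": [1] * 11 + [2] * 9,
    "Item": list("bacdcdebfgb") + list("xggfhifkl"),
})

# pd.factorize returns (codes, uniques). Codes start at 0 and follow the
# order of first appearance, so a repeated value reuses its earlier code.
codes, uniques = pd.factorize(pd.Series(["x", "g", "g", "f"]))


def first_seen_rank(items: pd.Series) -> pd.Series:
    """Hypothetical helper: 1-based rank of each item by first appearance."""
    return pd.Series(pd.factorize(items)[0] + 1, index=items.index)


# transform applies the helper to each user's items separately and puts
# the results back in the original row order.
df["Order Number"] = df.groupby("User_id")["Item"].transform(first_seen_rank)

print(df)
```

Note that numbering is consecutive within each user, so k and l for user 2 come out as 6 and 7, not 7 and 8 as written in the desired output of the question.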