For each user, I want a unique item order number, assigned in the order the user clicks through the items. If an item has already been seen, the count should not advance; the row should reuse the number already assigned to that item. See, for example, c, d, g and b in the tables below. I used the code below, but it's not getting the job done at the moment. If I add 'User_id' to the grouper, I mess up the ngroup(). Can anyone help me with this?
    df['Order Number'] = df.groupby(pd.Grouper(key='Item', sort=False)).ngroup() + 1
    print(df)
Current Output:
        User_id Item  Order Number
    0         1    b             1
    1         1    a             2
    2         1    c             3
    3         1    d             4
    4         1    c             3
    5         1    d             4
    6         1    e             5
    7         1    b             1
    8         1    f             6
    9         1    g             7
    10        1    b             1
    -----------------------------
    11        2    x             8
    12        2    g             7
    13        2    g             7
    14        2    f             6
    15        2    h             9
    16        2    i            10
    17        2    f             6
    18        2    k            11
    19        2    l            12
Desired Output:
        User_id Item  Order Number
    0         1    b             1
    1         1    a             2
    2         1    c             3
    3         1    d             4
    4         1    c             3
    5         1    d             4
    6         1    e             5
    7         1    b             1
    8         1    f             6
    9         1    g             7
    10        1    b             1
    -----------------------------
    11        2    x             1
    12        2    g             2
    13        2    g             2
    14        2    f             3
    15        2    h             4
    16        2    i             5
    17        2    f             3
    18        2    k             6
    19        2    l             7
Answer
Use GroupBy.transform with pd.factorize in a lambda function:
    # factorize() returns 0-based codes in order of first appearance, hence the +1
    df['Order Number'] = df.groupby('User_id')['Item'].transform(lambda x: pd.factorize(x)[0]) + 1
    print(df)

        User_id Item  Order Number
    0         1    b             1
    1         1    a             2
    2         1    c             3
    3         1    d             4
    4         1    c             3
    5         1    d             4
    6         1    e             5
    7         1    b             1
    8         1    f             6
    9         1    g             7
    10        1    b             1
    11        2    x             1
    12        2    g             2
    13        2    g             2
    14        2    f             3
    15        2    h             4
    16        2    i             5
    17        2    f             3
    18        2    k             6
    19        2    l             7
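For anyone who wants to try this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption: it is rebuilt by hand from the tables in the question, since the original code that created `df` was not posted. The lambda argument is renamed to `s` purely for readability.

```python
import pandas as pd

# Sample data reconstructed from the question's tables (an assumption;
# the asker's original DataFrame construction was not shown).
df = pd.DataFrame({
    "User_id": [1] * 11 + [2] * 9,
    "Item": list("bacdcdebfgb") + list("xggfhifkl"),
})

# Within each user's rows, pd.factorize() labels items 0, 1, 2, ... in order
# of first appearance; a repeated item gets the label it already has.
# Adding 1 turns the 0-based codes into 1-based order numbers.
df["Order Number"] = (
    df.groupby("User_id")["Item"].transform(lambda s: pd.factorize(s)[0]) + 1
)

print(df.to_string())
```

Because the factorization runs separately on each `User_id` group, the numbering restarts at 1 for user 2, and an item that both users clicked (such as g or f) can carry a different number for each of them.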