I have this DataFrame:
    import pandas as pd

    df = pd.DataFrame({"A": [1, 1, 1, 1, 1, 2, 2, 2, 3], "B": [1, 4, 5, 6, 10, 7, 8, 9, 3], "C": ["Hello", "World", "How", "are", "you", "today", "miss", "?", "!"]})

        A    B      C
    0  a1   a1  Hello
    1  a1   a4  World
    2  a1   a5    How
    3  a1   a6    are
    4  a1  a10    you
    5  a2   a7  today
    6  a2   a8   miss
    7  a2   a9      ?
    8  a3   a3      !
And I want something like this, where n is the number of rows in each A group for which A != B:
        A    B      C  n
    0  a1   a1  Hello  4
    1  a1   a4  World  4
    2  a1   a5    How  4
    3  a1   a6    are  4
    4  a1  a10    you  4
    5  a2   a7  today  3
    6  a2   a8   miss  3
    7  a2   a9      ?  3
    8  a3   a3      !  0
I tried this operation
    df["n"] = df.loc[df.A != df.B].groupby("A")["B"].transform(len)
But I get this result:
        A    B      C    n
    0  a1   a1  Hello  NaN
    1  a1   a4  World    4
    2  a1   a5    How    4
    3  a1   a6    are    4
    4  a1  a10    you    4
    5  a2   a7  today    3
    6  a2   a8   miss    3
    7  a2   a9      ?    3
    8  a3   a3      !  NaN
Do you know how I could apply my condition df.A != df.B inside the transform instead of filtering the original DataFrame?
Thanks
Answer
To count the matching values (the True values), group the boolean mask itself and use sum: each True is treated as 1 and each False as 0:
    df["n"] = (df.A != df.B).groupby(df["A"]).transform('sum')

    print(df)
       A   B      C  n
    0  1   1  Hello  4
    1  1   4  World  4
    2  1   5    How  4
    3  1   6    are  4
    4  1  10    you  4
    5  2   7  today  3
    6  2   8   miss  3
    7  2   9      ?  3
    8  3   3      !  0
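To see why summing the mask works, here is a minimal sketch on a small made-up frame (the name `toy` and its values are illustrative only, not taken from the question). Summing a boolean mask counts its True entries, and because the mask has one entry per row, the grouped result should have no gaps.

```python
import pandas as pd

# Toy data, chosen only for illustration.
toy = pd.DataFrame({"A": [1, 1, 2], "B": [1, 5, 2]})

# One boolean per row: True where A differs from B.
mask = toy["A"] != toy["B"]

# Summing booleans counts the True values (True -> 1, False -> 0).
total = int(mask.sum())

# Grouping the full-length mask by A keeps every row of the frame,
# so the transformed result lines up with the original index.
counts = mask.groupby(toy["A"]).transform("sum")
print(total, counts.tolist())
```

The key point is that the mask is never filtered, so every row of the frame receives a value.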
Or create a helper column:
    df["n"] = df.assign(B=df.A != df.B).groupby("A")['B'].transform('sum')

    print(df)

       A   B      C  n
    0  1   1  Hello  4
    1  1   4  World  4
    2  1   5    How  4
    3  1   6    are  4
    4  1  10    you  4
    5  2   7  today  3
    6  2   8   miss  3
    7  2   9      ?  3
    8  3   3      !  0
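The NaN values in the original attempt come from index alignment: the filtered frame has no rows where A equals B, so assigning its transform back leaves those rows empty. If filtering first is still preferred, one possible sketch (an alternative to the approaches above, not part of the answer) is to count per group on the filtered rows and then map the counts back onto A, filling groups that disappeared with 0. The intermediate name `counts` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 2, 2, 2, 3],
    "B": [1, 4, 5, 6, 10, 7, 8, 9, 3],
})

# Count the rows with A != B per group; group 3 has none,
# so it is simply absent from this Series.
counts = df.loc[df.A != df.B].groupby("A").size()

# Look up each row's group count via A; groups missing from
# `counts` become NaN, which is then replaced by 0.
df["n"] = df["A"].map(counts).fillna(0).astype(int)
print(df)
```

This takes two steps instead of one, so grouping the mask directly is usually the simpler choice.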