I have this Dataframe
 df = pd.DataFrame({"A": [1, 1, 1, 1, 1, 2, 2, 2, 3], "B": [1, 4, 5, 6, 10, 7, 8, 9, 3], "C": ["Hello", "World", "How", "are", "you", "today", "miss", "?", "!"]})
    A    B      C
0  a1   a1  Hello
1  a1   a4  World
2  a1   a5    How
3  a1   a6    are
4  a1  a10    you
5  a2   a7  today
6  a2   a8   miss
7  a2   a9      ?
8  a3   a3      !
And I want something like this
A B C n 0 a1 a1 Hello 4 1 a1 a4 World 4 2 a1 a5 How 4 3 a1 a6 are 4 4 a1 a10 you 4 5 a2 a7 today 3 6 a2 a8 miss 3 7 a2 a9 ? 3 8 a3 a3 ! 0
I tried this operation
df["n"] = df.loc[df.A != df.B].groupby("A")["B"].transform(len)
But I have this result
A B C n 0 a1 a1 Hello NaN 1 a1 a4 World 4 2 a1 a5 How 4 3 a1 a6 are 4 4 a1 a10 you 4 5 a2 a7 today 3 6 a2 a8 miss 3 7 a2 a9 ? 3 8 a3 a3 ! NaN
Do you know i could set my condition df.A != df.B on the transform instead on the original dataframe ?
Thanks
Advertisement
Answer
For count matched values (Trues) is possible pass mask with sum, Trues are processing like 1 and Falses like 0:
df["n"] = (df.A != df.B).groupby(df["A"]).transform('sum')
    
print (df)
   A   B      C  n
0  1   1  Hello  4
1  1   4  World  4
2  1   5    How  4
3  1   6    are  4
4  1  10    you  4
5  2   7  today  3
6  2   8   miss  3
7  2   9      ?  3
8  3   3      !  0
Or create helper column:
df["n"] = df.assign(B = df.A != df.B).groupby("A")['B'].transform('sum')
    
print (df)
   A   B      C  n
0  1   1  Hello  4
1  1   4  World  4
2  1   5    How  4
3  1   6    are  4
4  1  10    you  4
5  2   7  today  3
6  2   8   miss  3
7  2   9      ?  3
8  3   3      !  0