Pandas generate numeric sequence for groups in new column

I am working with a DataFrame like the one below:

import pandas as pd

df = pd.DataFrame({'A': ['A','A','A','B','B','C','C','C','C'],
                   'B': ['a','a','b','a','b','a','b','c','c']})
df
    
    A   B
0   A   a
1   A   a
2   A   b
3   B   a
4   B   b
5   C   a
6   C   b
7   C   c
8   C   c

I want to create a new column holding a sequence number for each Column B subgroup within each Column A group, like below:

    A   B   C
0   A   a   1
1   A   a   1
2   A   b   2
3   B   a   1
4   B   b   2
5   C   a   1
6   C   b   2
7   C   c   3
8   C   c   3

I tried this, but it does not give me the desired output:

df['C'] = df.groupby(['A','B']).cumcount() + 1

Answer

IIUC, I think you want something like this:

df['C'] = df.groupby('A')['B'].transform(lambda x: (x != x.shift()).cumsum())

Output:

   A  B  C
0  A  a  1
1  A  a  1
2  A  b  2
3  B  a  1
4  B  b  2
5  C  a  1
6  C  b  2
7  C  c  3
8  C  c  3
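
For anyone wondering why the one-liner works, here is a rough sketch that splits it into named steps on the same data. The helper name `run_ids` and the column `C_factorize` are made up for illustration and are not part of the original answer. The second variant uses `pd.factorize` and is only an alternative under a different assumption: that a value reappearing later in the same group should reuse its earlier number instead of starting a new run.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'B': ['a', 'a', 'b', 'a', 'b', 'a', 'b', 'c', 'c'],
})


def run_ids(s):
    # Hypothetical helper: same logic as the lambda in the answer.
    # Flag rows whose value differs from the previous row in the group.
    # The first row is compared against NaN, so it is always flagged.
    changed = s != s.shift()
    # The running total of the flags numbers each run of equal values.
    return changed.cumsum()


# Number consecutive runs of B within each A group.
df['C'] = df.groupby('A')['B'].transform(run_ids)

# Alternative (assumption: repeated values share a number even when not adjacent).
df['C_factorize'] = df.groupby('A')['B'].transform(
    lambda s: pd.factorize(s)[0] + 1
)

# Where the two approaches differ: a value that comes back after a gap.
gap = pd.Series(['a', 'b', 'a'])
gap_runs = run_ids(gap)               # new number for every run
gap_codes = pd.factorize(gap)[0] + 1  # the second 'a' reuses the first number

print(df)
```

On the sample data both columns come out the same, because every repeated value in B is adjacent within its group. If your real data can have something like a, b, a inside one group, pick whichever numbering matches what you actually need.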
User contributions licensed under: CC BY-SA