Python Pandas, Running Sum, based on previous row's value and grouped

Question

I have a pandas dataframe along these lines, showing where a customer service case sits before being closed. Every time the case is edited, an audit trail entry is captured. I want to generate a counter that increments each time the Department of a case changes from the department it was previously in.

Accepted Answer

Compare the previous and current row of Department within each ID, then group by ID again and take the cumulative sum to generate the counter:

    m = df['Department'] != df.groupby('ID')['Department'].shift()
    df['Dept_Change_Count'] = m.groupby(df['ID']).cumsum() - 1

The first row of each ID always counts as a change, because its shifted value is NaN, so subtracting 1 makes the counter start at 0.

Alternative approach using a single groupby with a lambda function to calculate the cumsum:

    # group_keys=False keeps the original index so the result aligns with df
    df['Dept_Change_Count'] = (
        df.groupby('ID', group_keys=False)['Department']
          .apply(lambda s: (s != s.shift()).cumsum()) - 1
    )

Result:

      ID  Department  Start Date    End Date  Dept_Change_Count
    0  A       Sales  01/01/2022  02/01/2022                  0
    1  A       Sales  02/01/2022  03/01/2022                  0
    2  A  Operations  03/01/2022  04/01/2022                  1
    3  A       Sales  04/01/2022  05/01/2022                  2
    4  B     Finance  01/01/2022  02/01/2022                  0
    5  B        Risk  02/01/2022  03/01/2022                  1
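For anyone who wants to try this end to end, here is a minimal self-contained sketch of the shift-and-cumsum approach. The DataFrame is rebuilt by hand from the sample rows in the question; the intermediate names `prev_dept` and `changed` are my own and only there to make each step visible. It assumes the rows are already in chronological order within each ID.

```python
import pandas as pd

# Sample data copied from the question (dates kept as plain strings).
df = pd.DataFrame({
    "ID": ["A", "A", "A", "A", "B", "B"],
    "Department": ["Sales", "Sales", "Operations", "Sales", "Finance", "Risk"],
    "Start Date": ["01/01/2022", "02/01/2022", "03/01/2022",
                   "04/01/2022", "01/01/2022", "02/01/2022"],
    "End Date": ["02/01/2022", "03/01/2022", "04/01/2022",
                 "05/01/2022", "02/01/2022", "03/01/2022"],
})

# Previous row's department within each ID (missing on the first row of each ID).
prev_dept = df.groupby("ID")["Department"].shift()

# True where the department differs from the previous row of the same ID.
# The first row of each ID is also True, since it is compared against a missing value.
changed = df["Department"] != prev_dept

# Running count of True values per ID; subtract 1 so each ID starts at 0.
df["Dept_Change_Count"] = changed.groupby(df["ID"]).cumsum() - 1

print(df)
```

If the rows are not guaranteed to be in order, sorting by ID and start date before computing the counter would be needed, since the result depends entirely on row order.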

ID	Department	Start Date	End Date
A	Sales	01/01/2022	02/01/2022
A	Sales	02/01/2022	03/01/2022
A	Operations	03/01/2022	04/01/2022
A	Sales	04/01/2022	05/01/2022
B	Finance	01/01/2022	02/01/2022
B	Risk	02/01/2022	03/01/2022


Answer