Skip to content
Advertisement

How to add rows with identical items in different columns in Pandas together

I have a sample dataframe that looks like below. I’d like to eventually group row 1 and row 3 together, since they contain identical items in different columns.

x    y      count
a,b  b,a      5
a,c  c,a      2
b,a  a,b      1

I’ve spent a lot of time trying to solve this, but have not encountered a good solution yet. What steps should I take to reach the below final dataframe?

x    y      count
a,b  b,a      5+1
a,c  c,a      2

Advertisement

Answer

You can try:

df.groupby((df.x + df.y).str.replace(',', '').apply(lambda x: ''.join(sorted(x)))
           ).agg({'x': 'first', 'y': 'first', 'count': sum}).reset_index(drop=True)

OUTPUT:

     x    y  count
0  a,b  b,a      6
1  a,c  c,a      2
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement