Skip to content
Advertisement

Pandas – Combine multiple group rows into one row

I have been banging my head against a wall for a while now trying to figure out this seemingly easy data manipulation task in Pandas, however I have had no success figuring out how to do it or googling a sufficient answer :(

All I want to do is take the table on the left of the snip below (will be a pandas dataframe) and convert it into the table on the right (to become another pandas dataframe).

enter image description here

Code for creating the initial dataframe:

import pandas as pd

test_data = pd.DataFrame(
        {
            'team': [1,1,2,2,3,3,4,4,5,5] ,
            'player': ['a','b','c','d','e','f','g','h','i','j'] ,
            'score': [10,22,66,44,1,3,55,6,4,2]
        }
)

Thank you for your help in advance!

Advertisement

Answer

try this,

test_data.groupby('team').agg({'player':['first', 'last'], 'score': ['first', 'last']})

O/P:

    player_first player_last  score_first  score_last
team                                                  
1               a           b           10          22
2               c           d           66          44
3               e           f            1           3
4               g           h           55           6
5               i           j            4           2

Complete solution:

test_data = test_data.groupby('team').agg({'player':['first', 'last'], 'score': ['first', 'last']})
test_data.columns = ['_'.join(x) for x in test_data.columns]
test_data = test_data.reset_index()
test_data = test_data[['team', 'player_first', 'score_first', 'player_last', 'score_last']]

O/P:

   team player_first  score_first player_last  score_last
0     1            a           10           b          22
1     2            c           66           d          44
2     3            e            1           f           3
3     4            g           55           h           6
4     5            i            4           j           2​
  • What you need is groupby and aggregation ops of first and last
  • set column names
  • reset index and re order columns
Advertisement