I have been banging my head against a wall for a while now trying to figure out this seemingly easy data manipulation task in pandas, but I have had no success working out how to do it or finding a sufficient answer by searching :(
All I want to do is take the table on the left of the snip below (will be a pandas dataframe) and convert it into the table on the right (to become another pandas dataframe).
Code for creating the initial dataframe:
import pandas as pd

test_data = pd.DataFrame(
    {
        'team': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
        'player': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
        'score': [10, 22, 66, 44, 1, 3, 55, 6, 4, 2]
    }
)
Thank you for your help in advance!
Answer
Try this:
test_data.groupby('team').agg({'player': ['first', 'last'], 'score': ['first', 'last']})
Output (the raw result has two-level column names; they are shown here joined with an underscore):
     player_first player_last  score_first  score_last
team
1               a           b           10          22
2               c           d           66          44
3               e           f            1           3
4               g           h           55           6
5               i           j            4           2
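As a quick check on why the column names need flattening afterwards, the sketch below rebuilds the question's data and inspects the columns returned by the aggregation above. The variable name `grouped` is only illustrative and does not appear in the original answer.

```python
import pandas as pd

# Same data as in the question.
test_data = pd.DataFrame(
    {
        'team': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
        'player': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
        'score': [10, 22, 66, 44, 1, 3, 55, 6, 4, 2],
    }
)

# `grouped` is a hypothetical name used only for this illustration.
grouped = test_data.groupby('team').agg(
    {'player': ['first', 'last'], 'score': ['first', 'last']}
)

# The columns form a two-level MultiIndex, so each column name is a tuple.
print(grouped.columns.tolist())
# [('player', 'first'), ('player', 'last'), ('score', 'first'), ('score', 'last')]
```

Joining each tuple with an underscore is what turns these into names such as player_first.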
Complete solution:
test_data = test_data.groupby('team').agg({'player': ['first', 'last'], 'score': ['first', 'last']})
test_data.columns = ['_'.join(x) for x in test_data.columns]
test_data = test_data.reset_index()
test_data = test_data[['team', 'player_first', 'score_first', 'player_last', 'score_last']]
Output:
   team player_first  score_first player_last  score_last
0     1            a           10           b          22
1     2            c           66           d          44
2     3            e            1           f           3
3     4            g           55           h           6
4     5            i            4           j           2
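Explanation:
- Group by team and aggregate player and score with first and last.
- Flatten the resulting two-level column names by joining each pair with an underscore.
- Reset the index so team becomes a regular column again, then reorder the columns.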
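The same steps can also be sketched with pandas named aggregation, which produces the flat column names directly and so skips the join step. This is an alternative to the answer's code, not the answer author's method; it assumes pandas 0.25 or later, and the variable name `wide` is hypothetical.

```python
import pandas as pd

# Same data as in the question.
test_data = pd.DataFrame(
    {
        'team': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
        'player': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
        'score': [10, 22, 66, 44, 1, 3, 55, 6, 4, 2],
    }
)

# Named aggregation: each keyword is an output column name and each value
# is a (source column, aggregation) pair. The keyword order sets the
# column order, so no reordering step is needed afterwards.
wide = (
    test_data.groupby('team')
    .agg(
        player_first=('player', 'first'),
        score_first=('score', 'first'),
        player_last=('player', 'last'),
        score_last=('score', 'last'),
    )
    .reset_index()  # turn the 'team' index back into a regular column
)

print(wide)
```

As with the answer's version, first and last depend on the row order within each team, so this assumes the rows are already in the order you want.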