Hi, I have a data frame which looks like this:

      col1  col2
    0    A     1
    1    B     2
    2    C     3
    3    A     4
    4    C     5
    5    A     6

I would like to group by and sum over non-repeating values in col1, e.g.:

    A,B,C => 6
    A,C => 9
    A => 6

Is there any way I can do this via pandas functions?
Answer
IIUC, you could create groups using groupby + cumcount (so that the nth occurrence of each col1 value lands in the same group); then group by those groups, join the col1 values, and sum the col2 values:

    out = df.groupby(df.groupby('col1').cumcount()).agg({'col1': ','.join, 'col2': 'sum'})

Output:

        col1  col2
    0  A,B,C     6
    1    A,C     9
    2      A     6

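
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby + cumcount idea. It assumes the frame is built exactly as shown in the question; the intermediate variable name `occurrence` is only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample frame, assumed to match the one in the question
df = pd.DataFrame({
    "col1": ["A", "B", "C", "A", "C", "A"],
    "col2": [1, 2, 3, 4, 5, 6],
})

# For each row, count how many times its col1 value has already appeared:
# the first A, B and C get 0, the second A and C get 1, the third A gets 2
occurrence = df.groupby("col1").cumcount()

# Group rows by that occurrence number, then join the labels and sum the values
out = df.groupby(occurrence).agg({"col1": ",".join, "col2": "sum"})
print(out)
```

Splitting the cumcount step into its own variable makes it easier to inspect which rows end up grouped together before aggregating.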