Skip to content
Advertisement

Sum pandas dataframe column values grouped by another column then update row with sum and remove duplicates

I’m trying to sum two columns (in the below example Seasons and Rating) in a pandas df for each Actor in the below example. I then want the totals to be stored per Actor and any other rows containing that Actor to be removed. In the below example the ‘Name’ that is retained or disgarded is not important.

For Example this is the input DF:

import pandas as pd

series = [('Stranger Things', 3, 5, 'Millie'),
          ('Game of Thrones', 8, 8, 'Emilia'), 
          ('La Casa De Papel', 4, 6, 'Sergio'),
          ('Westworld', 3, 7, 'Evan Rachel'), 
          ('Cops', 3, 10,  'Millie'),
          ('La Casa De Papel', 4, 7, 'Sergio')]

# Create a DataFrame object
df = pd.DataFrame(series, columns=['Name', 'Seasons', 'Rating', 'Actor'])

The required output is (again which of the ‘Name’ instances that is retained or discarded is not important but when there is more than one, one of them should be retained):

'Stranger Things', 6, 15, 'Millie'
'Game of Thrones', 8, 8, 'Emilia' 
'La Casa De Papel', 8, 13, 'Sergio'
'Westworld', 3, 7, 'Evan Rachel'

I’ve looked into a few different methods but can’t get anything to work without even thinking about the most efficient way of solving the challenge. The data set here will be around 1.5k to 3k rows and I’m using Python 3.x.

Advertisement

Answer

Group by ‘Actor’, use sum aggregation for ‘Seasons’ and ‘Rating’ columns, and an aggregation returning a single value for ‘Name’ (in this case I use max), then reset index to get ‘Actor’ column back as a distinct column, and re-order the columns in your specified order:

df.groupby('Actor').agg({'Seasons':'sum','Rating':'sum','Name':'max'}).reset_index()[['Name','Seasons','Rating','Actor']]

Outputs:

    Name                Seasons Rating  Actor
0   Game of Thrones     8       8       Emilia
1   Westworld           3       7       Evan Rachel
2   Stranger Things     6       15      Millie
3   La Casa De Papel    8       13      Sergio
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement