Ungrouping a pandas dataframe after aggregation operation

I have used the “groupby” method on my dataframe to find the total number of people at each location.

To the right of the “sum” column, I need to add a column that lists all of the people’s names at each location (ideally in separate rows, but a list would be fine too).

Is there a way to “ungroup” my dataframe again after having found the sum?

 dataframe.groupby(by=['location'], as_index=False)['people'].agg('sum')

Answer

You can do two different things:

(1) Build an aggregated DataFrame with groupby.agg, passing one aggregation per column. The code below sums people and collects the names at each location into a list (it assumes the names are stored in a column called name):

out = dataframe.groupby(by=['location'], as_index=False).agg({'people':'sum', 'name':list})
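As a rough illustration of option (1), here is a minimal sketch on made-up data. The sample rows are hypothetical and the column names (location, name, people) are assumed from the code above. The final explode step is an optional extra, not part of the original answer, for turning the lists back into one row per name.

```python
import pandas as pd

# Hypothetical sample data; column names are assumed from the answer's code.
dataframe = pd.DataFrame({
    "location": ["Paris", "Paris", "Rome"],
    "name": ["Ann", "Bob", "Cara"],
    "people": [2, 3, 4],
})

# One row per location: the total of 'people' plus a list of the names there.
out = dataframe.groupby(by=["location"], as_index=False).agg(
    {"people": "sum", "name": list}
)

# Optional: expand each list so every name gets its own row again.
# The location total is repeated on each of those rows.
exploded = out.explode("name").reset_index(drop=True)

print(out)
print(exploded)
```

Here out has one row per location, while exploded has one row per name with the total carried along.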

(2) Use groupby.transform to add a new column to the original dataframe that holds the location's total in every row. The dataframe keeps one row per person, so each name stays in its own row:

dataframe['sum'] = dataframe.groupby(by=['location'])['people'].transform('sum')
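
To show what option (2) produces, here is a small sketch on hypothetical data, again assuming columns named location, name and people. Because transform returns a result aligned with the original index, no rows are collapsed.

```python
import pandas as pd

# Hypothetical sample data; column names are assumed from the answer's code.
dataframe = pd.DataFrame({
    "location": ["Paris", "Paris", "Rome"],
    "name": ["Ann", "Bob", "Cara"],
    "people": [2, 3, 4],
})

# transform('sum') computes the per-location total and broadcasts it back
# to every row of that location, so the frame keeps its original shape.
dataframe["sum"] = dataframe.groupby(by=["location"])["people"].transform("sum")

print(dataframe)
```

This is probably the closer match to the "separate rows" layout asked for in the question, since nothing has to be ungrouped afterwards.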
User contributions licensed under: CC BY-SA