I have used the “groupby” method on my dataframe to find the total number of people at each location.
To the right of the “sum” column, I need to add a column that lists all of the people’s names at each location (ideally in separate rows, but a list would be fine too).
Is there a way to “ungroup” my dataframe again after having found the sum?
dataframe.groupby(by=['location'], as_index=False)['people'].agg('sum')
Answer
You can do two different things:
(1) Create an aggregated DataFrame using groupby.agg, passing one aggregation function per column. The code below sums people and collects all names for each location into a list:
out = dataframe.groupby(by=['location'], as_index=False).agg({'people':'sum', 'name':list})
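To make option (1) concrete, here is a minimal sketch on made-up data. Only the column names location, people and name come from the question; the values are hypothetical. It also shows one possible way to get the names in separate rows, using DataFrame.explode on the list column.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
dataframe = pd.DataFrame({
    "location": ["A", "A", "B"],
    "name": ["Ann", "Bob", "Cy"],
    "people": [1, 1, 1],
})

# One row per location: 'people' is summed, 'name' becomes a Python list.
out = dataframe.groupby(by=["location"], as_index=False).agg(
    {"people": "sum", "name": list}
)

# Optional: expand each list so every name gets its own row again.
# The summed 'people' value is repeated on each of those rows.
exploded = out.explode("name", ignore_index=True)

print(out)
print(exploded)
```

Keeping the list is more compact; exploding it gives the "separate rows" layout the question asks for, at the cost of repeating the sum.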
(2) Use groupby.transform to add a new column to dataframe that holds the sum of people by location in each row, so the original rows (and names) are kept and no "ungrouping" is needed:
dataframe['sum'] = dataframe.groupby(by=['location'])['people'].transform('sum')
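A small sketch of option (2) on made-up data, assuming the same location, people and name columns as in the question (the values are hypothetical). Because transform returns a result aligned with the original index, the frame keeps one row per person.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
dataframe = pd.DataFrame({
    "location": ["A", "A", "B"],
    "name": ["Ann", "Bob", "Cy"],
    "people": [1, 1, 1],
})

# transform('sum') broadcasts each group's total back onto every row of that group.
dataframe["sum"] = dataframe.groupby(by=["location"])["people"].transform("sum")

print(dataframe)
```

This is usually the simpler choice when each name should stay on its own row next to the location total.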