Skip to content
Advertisement

How to find the most frequent appearence in one column for different values in a different column of a grouped dataframe?

The question is not so clear I guess, so here is an example: given a dataframe:

company_name company_size company_acitivity
7 eleven 5 restaurant
7 eleven 5 supermarket
7 eleven 10 supermarket
goldman sachs 100 bank
goldman sachs 200 restaurant
goldman sachs 200 bank

I want to group the dataframe by company name and then replace the values in the organization_size and organization_acitivity columns with the values that have the highest occurrence for the respective company and column.

So in the end the dataframe should look like this:

company_name company_size company_acitivity
7 eleven 5 supermarket
goldman sachs 200 bank

I tried this:

JavaScript

But it only gives me

“AttributeError: ‘SeriesGroupBy’ object has no attribute ‘mode'”.

Does someone have an idea for an easier way to do this?

Advertisement

Answer

You don’t want to select a column after the groupby, since you want to apply that to all the available columns.

Try this:

JavaScript

Output:

JavaScript
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement