I have the following pandas dataframe
+---------+-------+ | Country | value | +---------+-------+ | UK | 42 | | US | 9 | | US | 10 | | France | 15 | | France | 16 | | Germany | 17 | | Germany | 18 | | Germany | 20 | +---------+-------+
I want to create a new column that ranks each of the countries according to the mean of their values from largest to smallest
The output would look like the following
+---------+-------+---------+------+ | Country | value | Average | Rank | +---------+-------+---------+------+ | UK | 42 | 42 | 1 | | US | 9 | 9.5 | 4 | | US | 10 | 9.5 | 4 | | France | 15 | 15.5 | 3 | | France | 16 | 15.5 | 3 | | Germany | 17 | 18 | 2 | | Germany | 18 | 18 | 2 | | Germany | 20 | 18 | 2 | +---------+-------+---------+------+
Note that I don’t need the average column, its just there to help with the explanation.
Many thanks
Advertisement
Answer
Use groupby
+ transform
for mean
and then rank
:
df['Average'] = df.groupby('Country')['value'].transform('mean') df['Rank'] = df['Average'].rank(method='dense', ascending=False) print (df) Country value Average Rank 0 UK 42 42.000000 1.0 1 US 9 9.500000 4.0 2 US 10 9.500000 4.0 3 France 15 15.500000 3.0 4 France 16 15.500000 3.0 5 Germany 17 18.333333 2.0 6 Germany 18 18.333333 2.0 7 Germany 20 18.333333 2.0
Similar solution:
a = df.groupby('Country')['value'].transform('mean') b = a.rank(method='dense', ascending=False) df = df.assign(Average=a, Rank=b) print (df) Country value Average Rank 0 UK 42 42.000000 1.0 1 US 9 9.500000 4.0 2 US 10 9.500000 4.0 3 France 15 15.500000 3.0 4 France 16 15.500000 3.0 5 Germany 17 18.333333 2.0 6 Germany 18 18.333333 2.0 7 Germany 20 18.333333 2.0