I have the following pandas dataframe
+---------+-------+ | Country | value | +---------+-------+ | UK | 42 | | US | 9 | | US | 10 | | France | 15 | | France | 16 | | Germany | 17 | | Germany | 18 | | Germany | 20 | +---------+-------+
I want to create a new column that ranks each of the countries according to the mean of their values from largest to smallest
The output would look like the following
+---------+-------+---------+------+ | Country | value | Average | Rank | +---------+-------+---------+------+ | UK | 42 | 42 | 1 | | US | 9 | 9.5 | 4 | | US | 10 | 9.5 | 4 | | France | 15 | 15.5 | 3 | | France | 16 | 15.5 | 3 | | Germany | 17 | 18 | 2 | | Germany | 18 | 18 | 2 | | Germany | 20 | 18 | 2 | +---------+-------+---------+------+
Note that I don’t need the average column, its just there to help with the explanation.
Many thanks
Advertisement
Answer
Use groupby + transform for mean and then rank:
df['Average'] = df.groupby('Country')['value'].transform('mean')
df['Rank'] = df['Average'].rank(method='dense', ascending=False)
print (df)
Country value Average Rank
0 UK 42 42.000000 1.0
1 US 9 9.500000 4.0
2 US 10 9.500000 4.0
3 France 15 15.500000 3.0
4 France 16 15.500000 3.0
5 Germany 17 18.333333 2.0
6 Germany 18 18.333333 2.0
7 Germany 20 18.333333 2.0
Similar solution:
a = df.groupby('Country')['value'].transform('mean')
b = a.rank(method='dense', ascending=False)
df = df.assign(Average=a, Rank=b)
print (df)
Country value Average Rank
0 UK 42 42.000000 1.0
1 US 9 9.500000 4.0
2 US 10 9.500000 4.0
3 France 15 15.500000 3.0
4 France 16 15.500000 3.0
5 Germany 17 18.333333 2.0
6 Germany 18 18.333333 2.0
7 Germany 20 18.333333 2.0