I have the following pandas DataFrame:
+---------+-------+
| Country | value |
+---------+-------+
| UK      | 42    |
| US      | 9     |
| US      | 10    |
| France  | 15    |
| France  | 16    |
| Germany | 17    |
| Germany | 18    |
| Germany | 20    |
+---------+-------+
I want to create a new column that ranks each country by the mean of its values, from largest to smallest.
The output would look like the following:
+---------+-------+---------+------+
| Country | value | Average | Rank |
+---------+-------+---------+------+
| UK      | 42    | 42      | 1    |
| US      | 9     | 9.5     | 4    |
| US      | 10    | 9.5     | 4    |
| France  | 15    | 15.5    | 3    |
| France  | 16    | 15.5    | 3    |
| Germany | 17    | 18.33   | 2    |
| Germany | 18    | 18.33   | 2    |
| Germany | 20    | 18.33   | 2    |
+---------+-------+---------+------+
Note that I don’t need the Average column; it’s only there to help with the explanation.
Many thanks
Answer
Use groupby + transform for the mean, and then rank:
df['Average'] = df.groupby('Country')['value'].transform('mean')
df['Rank'] = df['Average'].rank(method='dense', ascending=False)
print(df)
   Country  value    Average  Rank
0       UK     42  42.000000   1.0
1       US      9   9.500000   4.0
2       US     10   9.500000   4.0
3   France     15  15.500000   3.0
4   France     16  15.500000   3.0
5  Germany     17  18.333333   2.0
6  Germany     18  18.333333   2.0
7  Germany     20  18.333333   2.0
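Since the question says the Average column is not needed, here is a minimal self-contained sketch that skips it. It rebuilds the sample data from the question, keeps the group means in a temporary Series (the name `country_mean` is just an illustrative choice), and casts the rank to an integer, because `rank` returns floats. The `astype(int)` cast assumes there are no missing values in `value`.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Country": ["UK", "US", "US", "France", "France",
                "Germany", "Germany", "Germany"],
    "value": [42, 9, 10, 15, 16, 17, 18, 20],
})

# Group mean broadcast back to every row; held in a temporary Series
# (hypothetical name) instead of being stored as a column.
country_mean = df.groupby("Country")["value"].transform("mean")

# Dense ranking, largest mean first. rank() returns floats, so cast to int.
# The cast assumes no NaN values are present.
df["Rank"] = country_mean.rank(method="dense", ascending=False).astype(int)

print(df)
```

With the sample data this should give UK rank 1, Germany 2, France 3 and US 4, with only the Country, value and Rank columns in the result.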
Similar solution:
a = df.groupby('Country')['value'].transform('mean')
b = a.rank(method='dense', ascending=False)

df = df.assign(Average=a, Rank=b)
print(df)
   Country  value    Average  Rank
0       UK     42  42.000000   1.0
1       US      9   9.500000   4.0
2       US     10   9.500000   4.0
3   France     15  15.500000   3.0
4   France     16  15.500000   3.0
5  Germany     17  18.333333   2.0
6  Germany     18  18.333333   2.0
7  Germany     20  18.333333   2.0
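Both versions depend on method='dense'. Every row of a country carries the same mean, so the rows of one country are tied, and the other tie-breaking methods produce ranks with gaps or fractions. The sketch below compares three methods on the sample data from the question; the variable names are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Country": ["UK", "US", "US", "France", "France",
                "Germany", "Germany", "Germany"],
    "value": [42, 9, 10, 15, 16, 17, 18, 20],
})

means = df.groupby("Country")["value"].transform("mean")

# dense: tied rows share a rank and the next rank follows with no gap.
dense = means.rank(method="dense", ascending=False)
# min: tied rows share the lowest position, leaving gaps afterwards.
minimum = means.rank(method="min", ascending=False)
# average (the default): tied rows get the mean of their positions.
average = means.rank(ascending=False)

comparison = pd.DataFrame({
    "Country": df["Country"],
    "dense": dense,
    "min": minimum,
    "average": average,
})
print(comparison)
```

Only the dense column yields the consecutive 1 to 4 ranking that the question asks for.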