For each row, I want to keep only the name with the largest value, where the values come from a dictionary that maps names to numbers. Any suggestions on how to do this efficiently?
import pandas as pd

fruit_dict = {
    "Apple": 10,
    "Watermelon": 20,
    "Cherry": 30
}

df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4, 5],
        "name": [
            "Apple, Watermelon",
            "Cherry, Watermelon",
            "Apple",
            "Cherry, Apple",
            "Cherry",
        ],
    }
)

   ID                name
0   1   Apple, Watermelon
1   2  Cherry, Watermelon
2   3               Apple
3   4       Cherry, Apple
4   5              Cherry

Expected output:
   ID        name
0   1  Watermelon
1   2      Cherry
2   3       Apple
3   4      Cherry
4   5      Cherry

Answer
One way is to use apply with max and fruit_dict.get as the key:
new_df = (df.assign(name=df['name'].str.split(', ')
                      .apply(lambda l: max(l, key=fruit_dict.get)))
          )

or, if you expect some names to be missing from the dictionary:
new_df = (df.assign(name=df['name'].str.split(', ')
                      .apply(lambda l: max(l, key=lambda x: fruit_dict.get(x, float('-inf')))))
          )

output:
   ID        name
0   1  Watermelon
1   2      Cherry
2   3       Apple
3   4      Cherry
4   5      Cherry
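
To see what happens on each row, the per-row logic can be pulled out into a small helper and tried on plain lists. This is only a sketch: the helper name best_name and the fruit "Mango" are made up for illustration and are not part of the question.

```python
fruit_dict = {"Apple": 10, "Watermelon": 20, "Cherry": 30}

def best_name(names, scores=fruit_dict):
    """Return the name with the highest score; names missing from scores rank last."""
    return max(names, key=lambda name: scores.get(name, float("-inf")))

# Each row looks like this after str.split(', ')
rows = [["Apple", "Watermelon"], ["Cherry", "Watermelon"], ["Apple"]]
print([best_name(r) for r in rows])  # ['Watermelon', 'Cherry', 'Apple']

# "Mango" is a made-up name that is not in fruit_dict, so it ranks last
print(best_name(["Mango", "Apple"]))  # Apple

# Without the default, get() returns None for "Mango",
# and max() cannot compare None with an int
try:
    max(["Mango", "Apple"], key=fruit_dict.get)
except TypeError as exc:
    print("TypeError:", exc)
```

The same helper could then be passed to the column directly, e.g. df['name'].str.split(', ').apply(best_name), which should behave like the second lambda version above.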