I have a dataframe that looks like this:
0 1 2 3
0 {'Emotion': 'female_angry', 'Score': '90.0%'} {'Emotion': 'female_disgust', 'Score': '0.0%'} {'Emotion': 'female_fear', 'Score': '0.0%'}
1 {'Emotion': 'female_angry', 'Score': '0.0%'} {'Emotion': 'female_disgust', 'Score': '0.0%'} {'Emotion': 'female_fear', 'Score': '80.0%'}
2 {'Emotion': 'female_angry', 'Score': '0.1%'} {'Emotion': 'female_disgust', 'Score': '99.0%'} {'Emotion': 'female_fear', 'Score': '4.6%'}
I want to create a separate column that holds, for each row, the emotion with the highest score.
Like so:
  Emotion
0 'female_angry'
1 'female_fear'
2 'female_disgust'
I've gone through many references, but I can't relate them to my problem. Any suggestions?
Answer
You can use DataFrame.apply with axis=1 to iterate over each row:
# Compare scores as numbers; as strings, '100.0%' would rank below '90.0%'.
df_new = df.apply(lambda row: max([tuple(dct.values()) for dct in row],
                                  key=lambda x: float(x[1].rstrip('%'))
                                  )[0], axis=1).to_frame(name='Emotion')
print(df_new)
Output:
          Emotion
0    female_angry
1     female_fear
2  female_disgust
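To try the approach end to end, here is a minimal, self-contained sketch. It rebuilds only the three dict columns shown in the question, and it is a variant rather than the exact code above: it looks up the 'Emotion' and 'Score' keys by name instead of relying on the order of `dct.values()`. The helper name `score_value` is hypothetical.

```python
import pandas as pd

# Sample data copied from the question (three dict cells per row).
rows = [
    [{'Emotion': 'female_angry', 'Score': '90.0%'},
     {'Emotion': 'female_disgust', 'Score': '0.0%'},
     {'Emotion': 'female_fear', 'Score': '0.0%'}],
    [{'Emotion': 'female_angry', 'Score': '0.0%'},
     {'Emotion': 'female_disgust', 'Score': '0.0%'},
     {'Emotion': 'female_fear', 'Score': '80.0%'}],
    [{'Emotion': 'female_angry', 'Score': '0.1%'},
     {'Emotion': 'female_disgust', 'Score': '99.0%'},
     {'Emotion': 'female_fear', 'Score': '4.6%'}],
]
df = pd.DataFrame(rows)


def score_value(cell):
    """Hypothetical helper: turn a cell's 'Score' string such as '90.0%' into a float."""
    return float(cell['Score'].rstrip('%'))


# For each row, pick the dict with the largest numeric score and keep its emotion.
df_new = df.apply(
    lambda row: max(row, key=score_value)['Emotion'],
    axis=1,
).to_frame(name='Emotion')

print(df_new)
```

This assumes every cell is a dict with both keys present; rows with missing cells would need to be filtered first.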
Explanation:
>>> df.apply(lambda row: [tuple(dct.values()) for dct in row], axis=1)
# [('female_angry', '90.0%'), ('female_disgust', '0.0%'), ('female_fear', '0.0%')]
# [('female_angry', '0.0%'), ('female_disgust', '0.0%'), ('female_fear', '80.0%')]
# [('female_angry', '0.1%'), ('female_disgust', '99.0%'), ('female_fear', '4.6%')]

# Scores are parsed to floats so they compare numerically, not as text.
>>> max([('female_angry', '90.0%'), ('female_disgust', '0.0%'), ('female_fear', '0.0%')],
...     key=lambda x: float(x[1].rstrip('%')))
# ('female_angry', '90.0%')

>>> ('female_angry', '90.0%')[0]
# 'female_angry'
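One caveat about the key passed to max: the scores are strings, so it matters whether they are compared as text or as numbers. Text comparison goes character by character and need not match numeric order. A small sketch with made-up values (not taken from the question's data):

```python
# Made-up score strings for illustration only.
scores = ['100.0%', '90.0%', '9.5%']

# Plain string comparison: '9' sorts after '1', so '90.0%' wins.
by_string = max(scores)

# Numeric comparison: strip the percent sign and convert to float first.
by_number = max(scores, key=lambda s: float(s.rstrip('%')))

print(by_string)  # 90.0%
print(by_number)  # 100.0%
```

With the sample rows in the question both comparisons happen to agree, but a score of '100.0%' or a single-digit percentage can make them diverge.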