I have been looking to apply the following softmax function from https://machinelearningmastery.com/softmax-activation-function-with-python/
from scipy.special import softmax
# define data
data = [1, 3, 2]
# calculate softmax
result = softmax(data)
# report the probabilities
print(result)
[0.09003057 0.66524096 0.24472847]
I am trying to apply this to a dataframe that is split into groups, and return the probabilities row by row within each group. My dataframe is:
import pandas as pd
# Create DF
d = {
    'EventNo': ['10', '10', '12', '12', '12'],
    'Name': ['Joe', 'Jack', 'John', 'James', 'Jim'],
    'Rating': [30, 32, 2.5, 3, 4],
}

df = pd.DataFrame(data=d)
df
  EventNo   Name  Rating
0      10    Joe    30.0
1      10   Jack    32.0
2      12   John     2.5
3      12  James     3.0
4      12    Jim     4.0
In this instance there are two different events (10 and 12), where for event 10 the values are data = [30,32] and for event 12 data = [2.5,3,4]. My expected result would be a new column Probabilities with the results:
  EventNo   Name  Rating  Probabilities
0      10    Joe    30.0         0.1192
1      10   Jack    32.0         0.8807
2      12   John     2.5         0.1402
3      12  James     3.0         0.2312
4      12    Jim     4.0         0.6285
Any help on how to do this on all groups in the dataframe would be much appreciated! Thanks!
Answer
You can use groupby followed by transform, which returns results indexed like the original dataframe. A simple way to do it would be:
from scipy.special import softmax
df["Probabilities"] = df.groupby('EventNo')["Rating"].transform(softmax)
The result is
  EventNo   Name  Rating  Probabilities
0      10    Joe    30.0       0.119203
1      10   Jack    32.0       0.880797
2      12   John     2.5       0.140244
3      12  James     3.0       0.231224
4      12    Jim     4.0       0.628532
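For anyone who wants to check the result, here is a minimal self-contained sketch that rebuilds the dataframe from the question, applies the groupby and transform approach, and confirms that the probabilities within each event sum to 1. The helper manual_softmax and the Check column are not part of the original answer; they are hypothetical names added only to cross-check SciPy's softmax against a NumPy-only version.

```python
import numpy as np
import pandas as pd
from scipy.special import softmax

# Rebuild the dataframe from the question
df = pd.DataFrame({
    "EventNo": ["10", "10", "12", "12", "12"],
    "Name": ["Joe", "Jack", "John", "James", "Jim"],
    "Rating": [30, 32, 2.5, 3, 4],
})

# Softmax is computed separately for each EventNo group;
# transform returns a result aligned with the original row index
df["Probabilities"] = df.groupby("EventNo")["Rating"].transform(softmax)


def manual_softmax(s):
    """Hypothetical NumPy-only softmax, used here only as a cross-check."""
    # Subtracting the group maximum avoids overflow for large ratings
    e = np.exp(s - s.max())
    return e / e.sum()


df["Check"] = df.groupby("EventNo")["Rating"].transform(manual_softmax)

# Each group's probabilities should add up to 1
group_sums = df.groupby("EventNo")["Probabilities"].sum()

print(df)
print(group_sums)
```

If SciPy is not available, the manual_softmax version should work on its own with transform, since it only relies on NumPy and pandas.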