Apply a softmax function on groupby in the same pandas dataframe

I have been looking to apply the following softmax function (from https://machinelearningmastery.com/softmax-activation-function-with-python/):

from scipy.special import softmax
# define data
data = [1, 3, 2]
# calculate softmax
result = softmax(data)
# report the probabilities
print(result)
[0.09003057 0.66524096 0.24472847]

I am trying to apply this to a dataframe that is split into groups, and return the probabilities row by row within each group. My dataframe is:

import pandas as pd
#Create DF
d = { 
     'EventNo': ['10','10','12','12','12'],
    'Name': ['Joe','Jack','John','James','Jim'],
    'Rating':[30,32,2.5,3,4],
    }
             
df = pd.DataFrame(data=d)
df
  EventNo   Name  Rating
0      10    Joe    30.0
1      10   Jack    32.0
2      12   John     2.5
3      12  James     3.0
4      12    Jim     4.0

In this instance there are two different events (10 and 12): for event 10 the values are data = [30, 32], and for event 12 they are data = [2.5, 3, 4].

My expected result would be a new column, Probabilities, with the results:

  EventNo   Name  Rating  Probabilities
0      10    Joe    30.0         0.1192
1      10   Jack    32.0         0.8807
2      12   John     2.5         0.1402
3      12  James     3.0         0.2312
4      12    Jim     4.0         0.6285

Any help on how to do this on all groups in the dataframe would be much appreciated! Thanks!

Answer

You can use groupby followed by transform, which returns a result aligned with the original dataframe's index, so it can be assigned straight back as a new column. A simple way to do it would be:

df["Probabilities"] = df.groupby('EventNo')["Rating"].transform(softmax)

The result is:

  EventNo   Name  Rating  Probabilities
0      10    Joe    30.0       0.119203
1      10   Jack    32.0       0.880797
2      12   John     2.5       0.140244
3      12  James     3.0       0.231224
4      12    Jim     4.0       0.628532
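
If you want to see what is being computed per group, or would rather not depend on SciPy, the same idea can be sketched with NumPy alone. This is only an illustration: the helper name stable_softmax is made up for this example, and subtracting the group maximum before exponentiating is a common numerical-stability trick that is assumed here rather than taken from the answer above. It should give the same numbers as scipy.special.softmax.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "EventNo": ["10", "10", "12", "12", "12"],
    "Name": ["Joe", "Jack", "John", "James", "Jim"],
    "Rating": [30, 32, 2.5, 3, 4],
})


def stable_softmax(x):
    """Hypothetical helper: softmax over one group's values."""
    # Shifting by the max does not change the result but avoids overflow in exp
    exp_shifted = np.exp(x - x.max())
    return exp_shifted / exp_shifted.sum()


# transform calls the function once per EventNo group and puts each result
# back at the original row positions
df["Probabilities"] = df.groupby("EventNo")["Rating"].transform(stable_softmax)

# Sanity check: the probabilities within each event should sum to 1
group_sums = df.groupby("EventNo")["Probabilities"].sum()
print(df)
print(group_sums)
```

Checking that each group sums to 1 is a quick way to confirm the softmax was applied per event and not over the whole column.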
