I would like to apply the following softmax function, taken from https://machinelearningmastery.com/softmax-activation-function-with-python/:
from scipy.special import softmax

# define data
data = [1, 3, 2]
# calculate softmax
result = softmax(data)
# report the probabilities
print(result)

[0.09003057 0.66524096 0.24472847]
I am trying to apply this to a dataframe that is split into groups, and return the probabilities row by row within each group. My dataframe is:
import pandas as pd
# Create the DataFrame
d = {
    'EventNo': ['10', '10', '12', '12', '12'],
    'Name': ['Joe', 'Jack', 'John', 'James', 'Jim'],
    'Rating': [30, 32, 2.5, 3, 4],
}

df = pd.DataFrame(data=d)
df
  EventNo   Name  Rating
0      10    Joe    30.0
1      10   Jack    32.0
2      12   John     2.5
3      12  James     3.0
4      12    Jim     4.0
In this instance there are two events (10 and 12): for event 10 the values are data = [30, 32], and for event 12 they are data = [2.5, 3, 4].
My expected result would be a new column, Probabilities, with the results:
  EventNo   Name  Rating  Probabilities
0      10    Joe    30.0         0.1192
1      10   Jack    32.0         0.8807
2      12   John     2.5         0.1402
3      12  James     3.0         0.2312
4      12    Jim     4.0         0.6285
Any help on how to do this on all groups in the dataframe would be much appreciated! Thanks!
Answer
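You can use groupby followed by transform, which returns a result aligned with the original dataframe's index, so it can be assigned directly as a new column. A simple way to do it, with softmax imported from scipy.special as in the question, is: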
df["Probabilities"] = df.groupby('EventNo')["Rating"].transform(softmax)
The result is:
  EventNo   Name  Rating  Probabilities
0      10    Joe    30.0       0.119203
1      10   Jack    32.0       0.880797
2      12   John     2.5       0.140244
3      12  James     3.0       0.231224
4      12    Jim     4.0       0.628532
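If SciPy is not available, the same groupby and transform pattern should also work with a hand-written softmax. The following is only a minimal sketch: stable_softmax is a hypothetical helper name (not part of pandas or SciPy), and subtracting the group maximum before exponentiating is an added assumption to avoid overflow with large ratings. It also checks that the probabilities in each event sum to 1.

```python
import numpy as np
import pandas as pd


def stable_softmax(values):
    """Hypothetical helper: softmax computed with plain NumPy."""
    arr = np.asarray(values, dtype=float)
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    exps = np.exp(arr - arr.max())
    return exps / exps.sum()


df = pd.DataFrame({
    "EventNo": ["10", "10", "12", "12", "12"],
    "Name": ["Joe", "Jack", "John", "James", "Jim"],
    "Rating": [30, 32, 2.5, 3, 4],
})

# transform applies the function to each group's ratings and keeps the original row order.
df["Probabilities"] = df.groupby("EventNo")["Rating"].transform(stable_softmax)

# Sanity check: the probabilities within each event should add up to 1.
group_totals = df.groupby("EventNo")["Probabilities"].sum()
print(df)
print(group_totals)
```

Because transform returns one value per input row, the new column lines up with the original rows even when the events are interleaved rather than sorted.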
