I have been looking to apply the following softmax function from https://machinelearningmastery.com/softmax-activation-function-with-python/
    from scipy.special import softmax

    # define data
    data = [1, 3, 2]
    # calculate softmax
    result = softmax(data)
    # report the probabilities
    print(result)

    [0.09003057 0.66524096 0.24472847]
I am trying to apply this to a dataframe that is split into groups, and return the probabilities row by row within each group. My dataframe is:
    import pandas as pd

    # Create DF
    d = {
        'EventNo': ['10', '10', '12', '12', '12'],
        'Name': ['Joe', 'Jack', 'John', 'James', 'Jim'],
        'Rating': [30, 32, 2.5, 3, 4],
    }
    df = pd.DataFrame(data=d)
    df

      EventNo   Name  Rating
    0      10    Joe    30.0
    1      10   Jack    32.0
    2      12   John     2.5
    3      12  James     3.0
    4      12    Jim     4.0
In this instance there are two different events (10 and 12), where for event 10 the values are data = [30,32] and for event 12 data = [2.5,3,4].
My expected result would be a new column Probabilities with the results:

      EventNo   Name  Rating  Probabilities
    0      10    Joe    30.0         0.1192
    1      10   Jack    32.0         0.8807
    2      12   John     2.5         0.1402
    3      12  James     3.0         0.2312
    4      12    Jim     4.0         0.6285
Any help on how to do this on all groups in the dataframe would be much appreciated! Thanks!
Answer
You can use groupby followed by transform, which returns results indexed like the original dataframe. A simple way to do it would be:
df["Probabilities"] = df.groupby('EventNo')["Rating"].transform(softmax)
The result is
      EventNo   Name  Rating  Probabilities
    0      10    Joe    30.0       0.119203
    1      10   Jack    32.0       0.880797
    2      12   John     2.5       0.140244
    3      12  James     3.0       0.231224
    4      12    Jim     4.0       0.628532
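For anyone who wants to see what happens inside each group, or who would rather not depend on SciPy, here is a minimal self-contained sketch of the same groupby/transform pattern. The helper `stable_softmax` is a hypothetical name introduced for illustration and is not part of the original answer. It assumes the usual softmax definition (exponentiate, then divide by the sum) and subtracts the group maximum first, which does not change the result but avoids overflow for large ratings.

```python
import numpy as np
import pandas as pd


def stable_softmax(values: pd.Series) -> pd.Series:
    """Softmax over one group's values (hypothetical helper, for illustration)."""
    # Shifting by the max leaves the softmax unchanged and keeps exp() from overflowing.
    shifted = values.to_numpy(dtype=float) - values.max()
    exps = np.exp(shifted)
    # Return a Series on the group's index so transform can align it with df.
    return pd.Series(exps / exps.sum(), index=values.index)


df = pd.DataFrame({
    "EventNo": ["10", "10", "12", "12", "12"],
    "Name": ["Joe", "Jack", "John", "James", "Jim"],
    "Rating": [30, 32, 2.5, 3, 4],
})

# transform calls the function once per EventNo group and keeps the original row order.
df["Probabilities"] = df.groupby("EventNo")["Rating"].transform(stable_softmax)

# Sanity check: the probabilities within each event should sum to 1.
group_sums = df.groupby("EventNo")["Probabilities"].sum()

print(df)
print(group_sums)
```

Checking the per-group sums is a quick way to confirm that the softmax was applied within each event rather than across the whole column.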