Calculating the percentage of an outcome per group

I have a data frame of predictive model output that is separated into tertiles (low, medium, and high risk). I want to calculate the percentage of people in each risk zone who have the outcome of interest.

import pandas as pd

data = {'risk_group': ["medium", "low", "high", "low", "high", "high", ....],
        'outcome': [1, 0, 1, 0, 1, 1, ....]}

df = pd.DataFrame(data, columns=['risk_group', 'outcome'])

The desired output is a DataFrame along these lines:

low : 12% w/ outcome
medium : 34% w/ outcome
high : 78% w/ outcome

Answer

Use:

df.groupby('risk_group').outcome.apply(lambda x: x.sum()/x.size * 100)
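
If outcome is coded strictly as 0/1, the group mean gives the same percentage without a lambda. The following is only a minimal sketch under that assumption: the six sample rows are made up for illustration (they are not the asker's data), and the column name pct_with_outcome is a hypothetical label.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "risk_group": ["medium", "low", "high", "low", "high", "high"],
    "outcome": [1, 0, 1, 0, 1, 0],
})

# Assumes outcome is strictly 0/1, so the mean of each group
# equals the share of rows in that group with outcome == 1.
pct = (
    df.groupby("risk_group")["outcome"]
    .mean()
    .mul(100)                             # fraction -> percentage
    .reindex(["low", "medium", "high"])   # order rows low -> high
    .rename_axis("risk_group")
    .reset_index(name="pct_with_outcome") # Series -> DataFrame
)

print(pct)
```

The reindex step is only there to order the rows from low to high; without it, groupby sorts the group labels alphabetically (high, low, medium).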

Answer