I have a dataframe:
    date        type
    2021-08-12  fail
    2021-08-12  fail
    2021-08-12  win
    2021-08-12  great_win
    2021-08-13  fail
    2021-08-13  win
    2021-08-13  win
    2021-08-13  win
I want to calculate the percentage of each ‘type’ within each date group and then average those values across all dates. The desired intermediate result is:
    date        type       type_perc
    2021-08-12  fail       0.5
    2021-08-12  win        0.25
    2021-08-12  great_win  0.25
    2021-08-13  fail       0.25
    2021-08-13  win        0.75
    2021-08-13  great_win  0.0
Then I want to average across all dates. This is the desired final result:
    type       type_perc
    fail       0.375
    win        0.5
    great_win  0.175
How can I do that?
Answer
You can try this:
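    # Share of each type within each date: count per (date, type) / count per date
    tmp = df.groupby(['date', 'type']).size() / df.groupby('date')['type'].size()
    print(tmp)

    date        type
    2021-08-12  fail         0.50
                great_win    0.25
                win          0.25
    2021-08-13  fail         0.25
                win          0.75
    dtype: float64

    # Shares sum to 1 per date, so tmp.sum() equals the number of dates;
    # a type that is missing on a date therefore counts as 0 for that date
    result = tmp.groupby(level=1).sum() / tmp.sum()
    print(result)

    type
    fail         0.375
    great_win    0.125
    win          0.500
    dtype: float64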
or this:
    # Average only over the dates on which each type actually occurs
    result = tmp.groupby(level=1).mean()
    print(result)

    type
    fail         0.375
    great_win    0.250
    win          0.500
    dtype: float64
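It’s not quite clear from your question which one you need: the first treats a type that is missing on a date as 0 for that date, while the second averages only over the dates on which the type actually occurs.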
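To compare the two interpretations side by side, here is a minimal self-contained sketch. It rebuilds the sample data from the question and swaps in `pd.crosstab` with `normalize="index"` instead of the `groupby` division above, which is a different route to the same per-date shares. The variable names (`per_date`, `mean_with_zeros`, `mean_observed_only`) are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2021-08-12"] * 4 + ["2021-08-13"] * 4,
    "type": ["fail", "fail", "win", "great_win",
             "fail", "win", "win", "win"],
})

# Share of each type within each date (each row sums to 1).
# (date, type) combinations that never occur show up as 0.0.
per_date = pd.crosstab(df["date"], df["type"], normalize="index")

# Interpretation 1: average over all dates, counting a missing type as 0.
mean_with_zeros = per_date.mean()

# Interpretation 2: average only over dates where the type occurs.
# Zero shares are masked to NaN, which mean() skips by default.
mean_observed_only = per_date.where(per_date > 0).mean()

print(mean_with_zeros)
print(mean_observed_only)
```

On this sample the two results differ only for `great_win`, which is absent on 2021-08-13. The wide table in `per_date` also makes the zero-filled intermediate result from the question directly visible.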