I have a dataframe:
    date        type
    2021-08-12  fail
    2021-08-12  fail
    2021-08-12  win
    2021-08-12  great_win
    2021-08-13  fail
    2021-08-13  win
    2021-08-13  win
    2021-08-13  win
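To make the example reproducible, the frame can be built as in the sketch below. It assumes both columns are plain strings and that the frame is called `df` (the name the answer uses); the question does not state the dtypes.

```python
import pandas as pd

# Sample data copied from the question. Keeping both columns as plain
# strings is an assumption; the original dtypes are not given.
df = pd.DataFrame({
    "date": ["2021-08-12"] * 4 + ["2021-08-13"] * 4,
    "type": ["fail", "fail", "win", "great_win",
             "fail", "win", "win", "win"],
})

print(df)
```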
I want to calculate the share of each ‘type’ within each date group and then average those values across all dates. The intermediate result should be:
    date        type       type_perc
    2021-08-12  fail       0.5
    2021-08-12  win        0.25
    2021-08-12  great_win  0.25
    2021-08-13  fail       0.25
    2021-08-13  win        0.75
    2021-08-13  great_win  0.0
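One way to get this intermediate table, including the explicit 0.0 row for a type that does not occur on a date, might be `pd.crosstab` with `normalize="index"`. This is only a sketch: the frame name `df`, the helper name `shares`, and the string dtypes are assumptions, not part of the question.

```python
import pandas as pd

# Sample data from the question (string columns are an assumption).
df = pd.DataFrame({
    "date": ["2021-08-12"] * 4 + ["2021-08-13"] * 4,
    "type": ["fail", "fail", "win", "great_win",
             "fail", "win", "win", "win"],
})

# Row-normalised contingency table: every date row sums to 1, and a type
# that never occurs on a date gets 0.0 instead of being dropped.
shares = pd.crosstab(df["date"], df["type"], normalize="index")

# Reshape to one row per (date, type) pair, as in the desired table.
type_perc = (
    shares.reset_index()
    .melt(id_vars="date", var_name="type", value_name="type_perc")
    .sort_values(["date", "type"])
    .reset_index(drop=True)
)

print(type_perc)
```

Because the zero is kept, a plain average over dates later on treats the missing type as 0 rather than skipping it.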
Then I want to average across all dates. This is the desired final result:
    type       type_perc
    fail       0.375
    win        0.5
    great_win  0.125
How to do that?
Answer
You can try this:
    tmp = df.groupby(['date', 'type']).size() / df.groupby('date')['type'].size()
    print(tmp)

    # date        type
    # 2021-08-12  fail         0.50
    #             great_win    0.25
    #             win          0.25
    # 2021-08-13  fail         0.25
    #             win          0.75
    # dtype: float64

    result = tmp.groupby(level=1).sum() / tmp.sum()
    print(result)

    # type
    # fail         0.375
    # great_win    0.125
    # win          0.500
    # dtype: float64
or this:
    result = tmp.groupby(level=1).mean()
    print(result)

    # type
    # fail         0.375
    # great_win    0.250
    # win          0.500
    # dtype: float64
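It’s not quite clear from your question which of the two you need. The first version counts a type that is missing on a date as 0 for that date; the second averages only over the dates on which the type actually occurs.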
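The two results differ only for `great_win`, which occurs on one of the two dates. The sketch below makes the difference explicit by filling the missing (date, type) pair with 0 before averaging. The names `avg_with_zeros` and `avg_present_only` are hypothetical, and the sample frame is rebuilt from the question with string columns as an assumption.

```python
import pandas as pd

# Sample data from the question (string columns are an assumption).
df = pd.DataFrame({
    "date": ["2021-08-12"] * 4 + ["2021-08-13"] * 4,
    "type": ["fail", "fail", "win", "great_win",
             "fail", "win", "win", "win"],
})

# Share of each type within its date, as computed in the answer.
tmp = df.groupby(["date", "type"]).size() / df.groupby("date")["type"].size()

# Pivot types into columns and fill absent (date, type) pairs with 0,
# so every type is averaged over all dates.
avg_with_zeros = tmp.unstack(fill_value=0).mean()

# Average only over the dates on which each type actually appears.
avg_present_only = tmp.groupby(level="type").mean()

print(avg_with_zeros)
print(avg_present_only)
```

With the sample data, `avg_with_zeros` gives 0.125 for `great_win`, matching the desired final result in the question, while `avg_present_only` gives 0.25.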