I have this DataFrame. I want to group the ages into ranges 1-5, 6-10, 11-15, etc., and replace every value in a range with that range's mean.
  Name  Age
0    x    5
1    y    7
2    z    2
3    p    9
4    q   12
5    r    6
6    s    5
7    t    1
8    u   13
9    v   10
Now I want to add a column ageGroup that contains the mean age of each row's range. For example, 1-5 is one range, so every age in it gets that range's mean: (5+2+5+1) // 4 = 3. Similarly, for the range 11-15 it is (12+13) // 2 = 12.
So the expected output is:
  Name  Age  ageGroup
0    x    5         3
1    y    7         8
2    z    2         3
3    p    9         8
4    q   12        12
5    r    6         8
6    s    5         3
7    t    1         3
8    u   13        12
9    v   10         8
Answer
You can use pd.cut to bin the ages, then group by those bins with groupby and transform each group to its mean:
import pandas as pd

max_age = 15
step = 5
# Bin ages into (0, 5], (5, 10], (10, 15] and give each row its bin's rounded mean
df['ageGroup'] = df.groupby(pd.cut(df['Age'],
                                   range(0, max_age + step, step)))['Age'].transform('mean').round()
print(df)

  Name  Age  ageGroup
0    x    5       3.0
1    y    7       8.0
2    z    2       3.0
3    p    9       8.0
4    q   12      12.0
5    r    6       8.0
6    s    5       3.0
7    t    1       3.0
8    u   13      12.0
9    v   10       8.0
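The question computes the group mean with floor division (//) and expects integers, while the snippet above rounds and returns floats. As far as I know, round() sends halves to the nearest even number, so 12.5 becomes 12.0 here and matches, but a mean of 13.5 would round to 14 where floor division gives 13. Below is a minimal self-contained sketch of a floor-based variant. It rebuilds the sample data from the question, and the names age_bins and group_mean, as well as deriving max_age from the data, are my own assumptions rather than part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": list("xyzpqrstuv"),
    "Age": [5, 7, 2, 9, 12, 6, 5, 1, 13, 10],
})

step = 5
# Assumption: round the largest age up to a multiple of step (13 -> 15)
# instead of hard-coding max_age.
max_age = -(-int(df["Age"].max()) // step) * step

# Edges 0, 5, 10, 15 give the right-closed bins (0, 5], (5, 10], (10, 15].
age_bins = pd.cut(df["Age"], bins=list(range(0, max_age + step, step)))

# transform("mean") broadcasts each bin's mean back onto its rows.
group_mean = df.groupby(age_bins, observed=True)["Age"].transform("mean")

# Floor the mean, as the question's // does, and store it as an integer.
df["ageGroup"] = (group_mean // 1).astype(int)

print(df)
```

With this data the ageGroup column should come out as 3, 8, 3, 8, 12, 8, 3, 3, 12, 8, which is the integer column the question asks for. Note that pd.cut excludes the left edge of the first bin by default, so an age of 0 would fall outside every bin unless include_lowest=True is passed.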