I have a data frame like below,
JavaScript
x
10
10
1
Name = ['A','A','A','A','A','A','B','B','B','B','B','B','B']
2
Id = ['10','10','10','10','10','10','20','20','20','20','20','20','20']
3
Depth_Feet = ['69.1','70.5','71.4','72.8','73.2','74.2','208.0','209.2','210.2','211.0','211.2','211.7','212.5']
4
Val = ['2','3.1','1.1','2.1','6.0','1.1','1.2','1.3','3.1','2.9','5.0','6.1','3.2']
5
d = {'Name':Name,'Id':Id,'Depth_Feet':Depth_Feet,'Val':Val}
6
df = pd.DataFrame(d)
7
8
print (df.head(20))
9
10
JavaScript
1
16
16
1
Depth_Feet Id Name Val
2
0 69.1 10 A 2
3
1 70.5 10 A 3.1
4
2 71.4 10 A 1.1
5
3 72.8 10 A 2.1
6
4 73.2 10 A 6.0
7
5 74.2 10 A 1.1
8
6 208.0 20 B 1.2
9
7 209.2 20 B 1.3
10
8 210.2 20 B 3.1
11
9 211.0 20 B 2.9
12
10 211.2 20 B 5.0
13
11 211.7 20 B 6.1
14
12 212.5 20 B 3.2
15
16
I want to reduce the size of data frame by Depth_Feet column (let’s say every 2 feet). Desired output is
JavaScript
1
8
1
Depth_Feet Id Name Val
2
0 69.1 10 A 2
3
1 71.4 10 A 1.1
4
2 73.2 10 A 6.0
5
3 208.0 20 B 1.2
6
4 210.2 20 B 3.1
7
5 212.5 20 B 3.2
8
I have tried few options like round and group by etc, but I’m not able to get the result I want.
Advertisement
Answer
If need each 2 rows per groups:
JavaScript
1
12
12
1
df1 = df[df.groupby('Name').cumcount() % 2 == 0]
2
print (df1)
3
4
Name Id Depth_Feet Val
5
0 A 10 69.1 2
6
2 A 10 71.4 1.1
7
4 A 10 73.2 6.0
8
6 B 20 208.0 1.2
9
8 B 20 210.2 3.1
10
10 B 20 211.2 5.0
11
12 B 20 212.5 3.2
12
If need resample by 2 per groups convert values to TimedeltaIndex:
JavaScript
1
15
15
1
df2 = (df.set_index(pd.to_timedelta(df.Depth_Feet.astype(float), unit='D'))
2
.groupby('Name')
3
.resample('2D')
4
.first()
5
.reset_index(drop=True))
6
print (df2)
7
8
Name Id Depth_Feet Val
9
0 A 10 69.1 2
10
1 A 10 71.4 1.1
11
2 A 10 73.2 6.0
12
3 B 20 208.0 1.2
13
4 B 20 210.2 3.1
14
5 B 20 212.5 3.2
15