Skip to content
Advertisement

Downsample non timeseries pandas dataframe

I have a data frame like below,

Name = ['A','A','A','A','A','A','B','B','B','B','B','B','B']
Id = ['10','10','10','10','10','10','20','20','20','20','20','20','20']
Depth_Feet = ['69.1','70.5','71.4','72.8','73.2','74.2','208.0','209.2','210.2','211.0','211.2','211.7','212.5']
Val = ['2','3.1','1.1','2.1','6.0','1.1','1.2','1.3','3.1','2.9','5.0','6.1','3.2']
d = {'Name':Name,'Id':Id,'Depth_Feet':Depth_Feet,'Val':Val}
df = pd.DataFrame(d)

print (df.head(20))

 Depth_Feet  Id Name  Val
0        69.1  10    A    2
1        70.5  10    A  3.1
2        71.4  10    A  1.1
3        72.8  10    A  2.1
4        73.2  10    A  6.0
5        74.2  10    A  1.1
6       208.0  20    B  1.2
7       209.2  20    B  1.3
8       210.2  20    B  3.1
9       211.0  20    B  2.9
10      211.2  20    B  5.0
11      211.7  20    B  6.1
12      212.5  20    B  3.2

I want to reduce the size of data frame by Depth_Feet column (let’s say every 2 feet). Desired output is

  Depth_Feet  Id Name  Val
0       69.1  10    A    2
1       71.4  10    A  1.1
2       73.2  10    A  6.0
3      208.0  20    B  1.2
4      210.2  20    B  3.1
5      212.5  20    B  3.2

I have tried few options like round and group by etc, but I’m not able to get the result I want.

Advertisement

Answer

If need each 2 rows per groups:

df1 = df[df.groupby('Name').cumcount() % 2 == 0]
print (df1)

   Name  Id Depth_Feet  Val
0     A  10       69.1    2
2     A  10       71.4  1.1
4     A  10       73.2  6.0
6     B  20      208.0  1.2
8     B  20      210.2  3.1
10    B  20      211.2  5.0
12    B  20      212.5  3.2

If need resample by 2 per groups convert values to TimedeltaIndex:

df2 = (df.set_index(pd.to_timedelta(df.Depth_Feet.astype(float), unit='D'))
         .groupby('Name')
         .resample('2D')
         .first()
         .reset_index(drop=True))
print (df2)

  Name  Id Depth_Feet  Val
0    A  10       69.1    2
1    A  10       71.4  1.1
2    A  10       73.2  6.0
3    B  20      208.0  1.2
4    B  20      210.2  3.1
5    B  20      212.5  3.2
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement