I'm trying to filter the 'time' column into a new 'time_filtered' column based on the ranges in lut_lst: if a 'time' value falls in any of the ranges, replace it with NaN; otherwise copy the value into the new column.
```python
import numpy as np
import pandas as pd

# creates look up list for ranges that need to be excluded
lut_lst = []
for i in range(0, 2235, 15):
    a = range(i, 2 + i)
    b = range(14 + i, 15 + i)
    lut_lst.append(a)
    lut_lst.append(b)

lut_lst
# [range(0, 2),
#  range(14, 15),
#  range(15, 17),
#  range(29, 30),
#  range(30, 32),
#  range(44, 45),
#  range(45, 47),
#  range(59, 60),
#  ...
#  range(2190, 2192),
#  range(2204, 2205),
#  range(2205, 2207),
#  range(2219, 2220),
#  range(2220, 2222),
#  range(2234, 2235)]

## if 'time' value falls in any of the ranges of lut_lst, replace values with NaN (drop row)
data_cols = ['filename', 'time']
data_vals = [['cell1', 0.0186],
             ['cell1', 0.0774],
             ['cell1', 2.2852],
             ['cell1', 2.3788],
             ['cell1', 14.62],
             ['cell1', 15.04],
             ['cell2', 20.3416],
             ['cell2', 20.9128],
             ['cell2', 29.6784],
             ['cell2', 30.1194],
             ['cell2', 32.3304]]

df = pd.DataFrame(data_vals, columns=data_cols)

# trying to filter 'time' but can't get INTO the ranges
df['time_filtered'] = df['time'].apply(lambda x: x if (x not in lut_lst) else np.nan)
```
The output for df is not filtered: time_filtered is just a copy of time. I also tried any(lut_lst) and all(lut_lst), but those just threw an error.
```
df
   filename  record     time  time_filtered
0     cell1       1   0.0186         0.0186
1     cell1       1   0.0774         0.0774
2     cell1       1   2.2852         2.2852
3     cell1      25   2.3788         2.3788
4     cell1      25  14.6200        14.6200
5     cell1     101  15.0400        15.0400
6     cell2       2  20.3416        20.3416
7     cell2       2  20.9128        20.9128
8     cell2      50  29.6784        29.6784
9     cell2      50  30.1194        30.1194
10    cell2      80  32.3304        32.3304
```
Answer
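It may help to see first why the original filter never matches. The snippet below is a minimal sketch using two of the ranges from the question; the variable names in_list, in_any_range and in_any_interval are only illustrative, and the half-open bound check in the last test is an assumption about what "falls in a range" should mean.

```python
# Two of the exclusion ranges from the question.
lut_lst = [range(0, 2), range(14, 15)]

x = 0.0186

# `in` on a list compares x with each element using ==.
# A float is never equal to a range object, so this is always False.
in_list = x in lut_lst

# Testing each range individually does not help for non-integer values:
# a range only contains the integers it would yield.
in_any_range = any(x in r for r in lut_lst)

# Comparing against the bounds treats each range as a continuous interval.
# Half-open [start, stop) is an assumption made for this sketch.
in_any_interval = any(r.start <= x < r.stop for r in lut_lst)

print(in_list, in_any_range, in_any_interval)  # False False True
```

So `x not in lut_lst` is True for every row, which is why every value is copied unchanged.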
Store each range in lut_lst as a tuple of (lower, upper) bounds instead of a range object, and change the filter to compare x against each pair of bounds:
```python
import numpy as np
import pandas as pd

# creates look up list for ranges that need to be excluded
lut_lst = []
for i in range(0, 2235, 15):
    a = i, 2 + i
    b = 14 + i, 15 + i
    lut_lst.append(a)
    lut_lst.append(b)

## if 'time' value falls in any of the ranges of lut_lst, replace values with NaN (drop row)
data_cols = ['filename', 'time']
data_vals = [['cell1', 0.0186],
             ['cell1', 0.0774],
             ['cell1', 2.2852],
             ['cell1', 2.3788],
             ['cell1', 14.62],
             ['cell1', 15.04],
             ['cell2', 20.3416],
             ['cell2', 20.9128],
             ['cell2', 29.6784],
             ['cell2', 30.1194],
             ['cell2', 32.3304]]

df = pd.DataFrame(data_vals, columns=data_cols)

# keep x unless it lies strictly between the bounds of any (a, b) pair
df['time_filtered'] = df['time'].apply(
    lambda x: x if not any(a < x < b for a, b in lut_lst) else np.nan)

df
```
Output:
```
   filename     time  time_filtered
0     cell1   0.0186            NaN
1     cell1   0.0774            NaN
2     cell1   2.2852         2.2852
3     cell1   2.3788         2.3788
4     cell1  14.6200            NaN
5     cell1  15.0400            NaN
6     cell2  20.3416        20.3416
7     cell2  20.9128        20.9128
8     cell2  29.6784            NaN
9     cell2  30.1194            NaN
10    cell2  32.3304        32.3304
```
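If the DataFrame is large, the per-row apply can get slow, since it loops over every bound pair for every row in Python. One possible alternative, not part of the answer above, is to do the same strict a < x < b comparison with NumPy broadcasting. This is only a sketch: the names bounds, lower, upper and excluded are made up for illustration, and the DataFrame is rebuilt from the question's sample data.

```python
import numpy as np
import pandas as pd

# Same (lower, upper) bound pairs as in the answer.
lut_lst = []
for i in range(0, 2235, 15):
    lut_lst.append((i, 2 + i))
    lut_lst.append((14 + i, 15 + i))

df = pd.DataFrame({
    'filename': ['cell1'] * 6 + ['cell2'] * 5,
    'time': [0.0186, 0.0774, 2.2852, 2.3788, 14.62, 15.04,
             20.3416, 20.9128, 29.6784, 30.1194, 32.3304],
})

bounds = np.array(lut_lst)            # shape (n_ranges, 2)
lower, upper = bounds[:, 0], bounds[:, 1]
t = df['time'].to_numpy()[:, None]    # shape (n_rows, 1), broadcasts against the bounds

# True where a time lies strictly inside at least one (lower, upper) pair.
excluded = ((t > lower) & (t < upper)).any(axis=1)

# Series.where keeps values where the condition is True and puts NaN elsewhere.
df['time_filtered'] = df['time'].where(~excluded)
print(df)
```

The comparison builds an intermediate boolean array of shape (n_rows, n_ranges), so this trades memory for speed; for very long inputs it may need to be done in chunks.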