I have come across an issue with the np.select section of my code and have put together a minimal reproducible example to ask why ValueError: -1 is not in range is raised rather than nan being returned.
import numpy as np
import pandas as pd

test = {'number' : [1,2,3,4,5,6]}
df = pd.DataFrame(data=test)

print(df)

number = 1

#check row index of first value less than 'number'
print((np.abs(df['number']-number)).values.argmin()-1)

conditions = [number <= df['number'][3], number > df['number'][3]]
selection = [df['number'][(np.abs(df['number']-number)).values.argmin()-1], 'ignore'] # get first value in df['number'] column less than 'number' variable
answer = np.select(conditions, selection, default=np.nan)

print(answer)
Using df['number'][3] when number = 1, I would expect nan to be returned: the value located at df['number'][3] is 4, and although number = 1 is less than 4, there is no row above the row in df['number'] where the value is 1. Instead I get ValueError: -1 is not in range.
Answer
The modification below avoids the error mentioned above. It is the original code with one minor change: an if statement that guards the row lookup.
import numpy as np
import pandas as pd

test = {'number' : [1,2,3,4,5,6]}
df = pd.DataFrame(data=test)

print(df)

number = 1

#check row index of first value less than 'number'
row_index_less = (np.abs(df['number']-number)).values.argmin()-1
print(row_index_less)

if row_index_less > -1:
    conditions = [number <= df['number'][3], number > df['number'][3]]
    # get first value in df['number'] column less than 'number' variable
    selection = [df['number'][row_index_less], 'ignore']
    answer = np.select(conditions, selection, default=np.nan)
else:
    # no row above the closest match, so there is nothing to look up
    answer = np.nan
print(answer)
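As for why the original raises instead of returning nan: the selection list is an ordinary Python list, so every entry in it is evaluated before np.select is even called. With number = 1 the computed row index is -1, and df['number'][-1] is a label lookup on the default integer index, which has no label -1. The default value never gets a chance to apply. The sketch below shows that failing lookup in isolation, and one possible way to wrap the guard in a function that uses positional indexing with .iloc. The helper name value_before_closest is hypothetical and only for illustration; it is not part of NumPy or pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"number": [1, 2, 3, 4, 5, 6]})

# The lookup that fails in the original code: -1 is treated as a label,
# and the default integer index only contains the labels 0..5.
try:
    df["number"][-1]
    lookup_failed = False
except (KeyError, ValueError):
    lookup_failed = True
print(lookup_failed)  # True


def value_before_closest(series, number):
    """Hypothetical helper: value one row above the row closest to `number`.

    Returns NaN when the closest row is the first row.
    """
    # Position (not label) of the row whose value is closest to `number`.
    pos = int(np.abs(series - number).to_numpy().argmin())
    if pos == 0:
        # There is no row above the first one, so nothing to look up.
        return np.nan
    # .iloc is positional, so it does not depend on the index labels.
    return series.iloc[pos - 1]


print(value_before_closest(df["number"], 1))  # nan
print(value_before_closest(df["number"], 4))  # 3
```

Because the bounds check happens before any indexing, the function can be called for any number without the lookup raising, and the result can then be passed on to whatever conditional logic follows.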