I have come across an issue with the np.select section of my code and have put together a minimal reproducible example to ask why ValueError: -1 is not in range is raised instead of nan being returned:
import numpy as np
import pandas as pd
test = {'number' : [1,2,3,4,5,6]}
df = pd.DataFrame(data=test)
print(df)
number = 1
#check row index of first value less than 'number'
print((np.abs(df['number']-number)).values.argmin()-1)
conditions = [number <= df['number'][3], number > df['number'][3]]
selection = [df['number'][(np.abs(df['number']-number)).values.argmin()-1], 'ignore'] # get first value in df['number'] column less than 'number' variable
answer = np.select(conditions, selection, default=np.nan)
print(answer)
Using df['number'][3] with number = 1, I would expect nan to be returned: the value at df['number'][3] is 4, and although number = 1 is less than 4, there is no row above the row in df['number'] where the value is 1 (it sits at row index 0).
Instead I get ValueError: -1 is not in range.
Answer
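The error is raised before np.select is ever called. The selection list is built first, so df['number'][...] is evaluated with argmin() - 1 = -1, and -1 is not a label in the default index. The code below is the original with one small change: the row index is computed once and checked with an if statement, so the lookup only happens when a row above exists: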
import numpy as np
import pandas as pd
test = {'number' : [1,2,3,4,5,6]}
df = pd.DataFrame(data=test)
print(df)
number = 1
#check row index of first value less than 'number'
row_index_less = (np.abs(df['number'] - number)).values.argmin() - 1
print(row_index_less)
if row_index_less > -1:
    conditions = [number <= df['number'][3], number > df['number'][3]]
    # get first value in df['number'] column less than 'number' variable
    selection = [df['number'][row_index_less], 'ignore']
    answer = np.select(conditions, selection, default=np.nan)
else:
    # no row above the match, so fall back to nan instead of indexing with -1
    answer = np.nan
print(answer)
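Since every condition here is a single scalar comparison, the same logic can also be written as a plain function without np.select. The following is only a sketch under that assumption: the name lookup_previous and its threshold_pos and default parameters are hypothetical and not part of the original code. It uses .iloc for position-based access, so the position is checked before any lookup happens.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'number': [1, 2, 3, 4, 5, 6]})


def lookup_previous(series, number, threshold_pos=3, default=np.nan):
    """Hypothetical helper mirroring the if/else logic above.

    Returns 'ignore' when `number` is greater than the value at
    `threshold_pos`; otherwise returns the value one row above the
    entry closest to `number`, or `default` when no such row exists.
    """
    if number > series.iloc[threshold_pos]:
        return 'ignore'
    # position of the closest value, shifted up by one row
    pos = int(np.abs(series - number).to_numpy().argmin()) - 1
    if pos < 0:
        # there is no row above the first row
        return default
    return series.iloc[pos]


print(lookup_previous(df['number'], 1))
print(lookup_previous(df['number'], 3))
print(lookup_previous(df['number'], 5))
```

With the sample data, the three calls cover the three branches: no row above the match, a valid row above the match, and a number beyond the threshold.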