I have come across an issue with the np.select section of my code and have put together a minimal reproducible example to ask why ValueError: -1 is not in range is raised rather than nan being returned:
import numpy as np
import pandas as pd

test = {'number': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data=test)
print(df)

number = 1

# check row index of first value less than 'number'
print((np.abs(df['number'] - number)).values.argmin() - 1)

conditions = [number <= df['number'][3], number > df['number'][3]]
# get first value in df['number'] column less than 'number' variable
selection = [df['number'][(np.abs(df['number'] - number)).values.argmin() - 1], 'ignore']
answer = np.select(conditions, selection, default=np.nan)
print(answer)
Using df['number'][3] when number = 1, I would expect nan to be returned: the value at df['number'][3] is 4, and although number = 1 is less than 4, there is no row above the one in df['number'] where the value is 1. Instead I get ValueError: -1 is not in range.
Answer
Here is a modification that avoids the error mentioned above. It is the original code with one minor change: the row index is computed first, and an if statement guards against a negative index:
import numpy as np
import pandas as pd

test = {'number': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data=test)
print(df)

number = 1

# check row index of first value less than 'number'
row_index_less = (np.abs(df['number'] - number)).values.argmin() - 1
print(row_index_less)

if row_index_less > -1:
    conditions = [number <= df['number'][3], number > df['number'][3]]
    # get first value in df['number'] column less than 'number' variable
    selection = [df['number'][row_index_less], 'ignore']
    answer = np.select(conditions, selection, default=np.nan)
else:
    answer = np.nan
print(answer)
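As far as I can tell, the error arises because Python builds the selection list before np.select ever looks at the conditions, so df['number'][-1] is evaluated regardless of which condition holds; with the default integer index that is a label lookup, and -1 is not a label. A minimal sketch of the same guard, using positional .iloc lookups and a plain if in place of np.select, might look like this. Note that previous_value and pivot_pos are hypothetical names introduced here for illustration and are not part of the original code:

```python
import numpy as np
import pandas as pd


def previous_value(series, number, pivot_pos=3):
    """Hypothetical helper: value one row above the closest match.

    Returns nan when there is no row above the match, and 'ignore'
    when `number` is greater than the value at position `pivot_pos`.
    """
    # Position (not label) of the entry closest to `number`, minus one row.
    pos = int(np.abs(series - number).to_numpy().argmin()) - 1
    if pos < 0:
        # The closest match is the first row, so nothing sits above it.
        return np.nan
    if number > series.iloc[pivot_pos]:
        return 'ignore'
    return series.iloc[pos]


df = pd.DataFrame({'number': [1, 2, 3, 4, 5, 6]})
for number in (1, 3, 5):
    print(number, previous_value(df['number'], number))
```

With this data, number = 1 gives nan, number = 3 gives 2, and number = 5 gives 'ignore'. Because every lookup goes through .iloc, the sketch does not depend on the index labels being 0, 1, 2, and so on.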