
Issue with pandas.index.get_loc() when match is found, TypeError: ("'>' not supported between instances of 'NoneType' and 'str'", 'occurred at index 1')

Below is the example to reproduce the error:

import pandas as pd

testx1df = pd.DataFrame()
testx1df['A'] = [100,200,300,400]
testx1df['B'] = [15,60,35,11]
testx1df['C'] = [11,45,22,9]
testx1df['D'] = [5,15,11,3]
testx1df['E'] = [1,6,4,0]


(testx1df[testx1df < 6].apply(lambda x: x.index.get_loc(x.first_valid_index(), method='ffill'), axis=1))

The desired output should be a list or array with the values [3,NaN,4,3]. The NaN is there because the second row has no value that satisfies the criterion.

I checked the pandas reference, which says that when there is no exact match you can set "method" to 'ffill', 'bfill', or 'nearest' to pick the previous, next, or closest index. Based on this, I expected that specifying 'ffill' would give me an index of 4 instead of NaN. However, when I do so it does not work and I get the error shown in the question title. For a threshold higher than 6 it works fine, but it fails for less than 6 because the second row in the data frame has no value that satisfies the condition.
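To narrow this down, here is a small check of the second row on its own. It is only a sketch built on the frame from above (the names masked and row are just for illustration), but it suggests where the NoneType in the error message comes from: first_valid_index() has nothing to return for a row that is entirely NaN, so None is what gets passed to get_loc.

```python
import pandas as pd

testx1df = pd.DataFrame({
    'A': [100, 200, 300, 400],
    'B': [15, 60, 35, 11],
    'C': [11, 45, 22, 9],
    'D': [5, 15, 11, 3],
    'E': [1, 6, 4, 0],
})

# Values that fail the condition become NaN.
masked = testx1df[testx1df < 6]

# Second row (index 1): no value is below 6, so every entry is NaN.
row = masked.iloc[1]
print(row.isna().all())         # True
print(row.first_valid_index())  # None
```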

Is there a way around this issue? Should it not work for my example (returning the previous index of 3 or 4)?

One solution I thought of is to add a dummy column populated by zeros so that it has a place to "find" an index that satisfies the criterion, but this seems a bit crude to me and I think there is a more efficient solution out there.


Answer

Please try this:

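import numpy as np

# Count the NaNs per row of the masked frame. Because the values in each row
# decrease from left to right, this count equals the position of the first valid value.
ls = list(testx1df[testx1df<6].T.isna().sum())
# A row that is entirely NaN has no valid value, so report NaN for it.
ls = [np.nan if x==testx1df.shape[1] else x for x in ls]
print(ls)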
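
If the qualifying values are not guaranteed to sit at the end of each row, counting NaNs no longer gives the position of the first match. Below is a rough sketch of two alternatives that look up the first match directly. The helper name first_valid_position and the variables by_apply and by_argmax are made up for this example. Neither option uses the method argument of get_loc, which, as far as I know, was deprecated in pandas 1.4 and removed in 2.0.

```python
import numpy as np
import pandas as pd

testx1df = pd.DataFrame({
    'A': [100, 200, 300, 400],
    'B': [15, 60, 35, 11],
    'C': [11, 45, 22, 9],
    'D': [5, 15, 11, 3],
    'E': [1, 6, 4, 0],
})

# Option 1: keep the apply/get_loc approach, but guard against rows
# where first_valid_index() returns None (no value passes the filter).
def first_valid_position(row):
    label = row.first_valid_index()
    return np.nan if label is None else row.index.get_loc(label)

by_apply = testx1df[testx1df < 6].apply(first_valid_position, axis=1)

# Option 2: vectorized. argmax on a boolean array gives the position of
# the first True; rows without any True are replaced by NaN.
mask = (testx1df < 6).to_numpy()
by_argmax = np.where(mask.any(axis=1), mask.argmax(axis=1), np.nan)

print(by_apply.tolist())   # [3.0, nan, 4.0, 3.0]
print(by_argmax.tolist())  # [3.0, nan, 4.0, 3.0]
```

The values come back as floats because NaN cannot be stored in an integer array.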
User contributions licensed under: CC BY-SA