Skip to content
Advertisement

LOC search string with AND condition in Python

I’m trying to use LOC with an AND condition. It works fine with OR conditions, but I can’t get it to work with ANDs when there are duplicate values in a column.

def locreplace(df,col,needle,replace,needle2=''):
   if (needle2==''):
      df.loc[df[col].str.contains(needle, case=False)==True,col] = replace
   else:
     df.loc[[df[col].str.contains(needle, case=False)==True] and df[col].str.contains(needle2, case=False)==True,col] = replace

This table with no duplicates works as expected:

#Create a data frame
data = [['granny apple', 'juicy'], ['blood orange', 'refreshing'], ['spanish lemon', 'tangy']]
fruitdf = pd.DataFrame(data, columns = ['fruit', 'taste'])

#Single replace - works
#locreplace(fruitdf,'fruit','apple','big red nice apple')

#Will fail - works
#locreplace(fruitdf,'fruit','apple','big red apple','uncle')

#Double replace - works
locreplace(fruitdf,'fruit','apple','big huge red apple','granny')

But when you create a data frame with two “granny” entries the double replace AND condition replaces both instances of “granny” even though “apple” in the AND condition isn’t being matched.

data = [['granny apple', 'juicy'], ['granny blood orange', 'refreshing'], ['spanish lemon', 'tangy']]
fruitdf = pd.DataFrame(data, columns = ['fruit', 'taste'])

#Single replace - works
#locreplace(fruitdf,'fruit','apple','big red nice apple')

#Will fail - works
#locreplace(fruitdf,'fruit','apple','big red apple','uncle')


#Double replace - fails
locreplace(fruitdf,'fruit','apple','big huge red apple','granny')


No doubt my fault, and a misplacing of brackets (or misunderstanding of code), but what is the correct way to achieve an AND condition replace with loc (or other easier method)?

Current output:

    fruit   taste
0   big huge red apple  juicy
1   big huge red apple  refreshing
2   spanish lemon   tangy

Desired output:

    fruit   taste
0   big huge red apple  juicy
1   granny blood orange refreshing
2   spanish lemon   tangy

Advertisement

Answer

The issue is in the else block in locreplace

[df[col].str.contains(needle, case=False)==True]

Which is a list with a series in the first index rather than a series. You need to remove the brackets and replace and with &

df.loc[df[col].str.contains(needle, case=False) & df[col].str.contains(needle2, case=False), col] = replace

Output

                 fruit       taste
0   big huge red apple       juicy
1  granny blood orange  refreshing
2        spanish lemon       tangy
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement