Skip to content
Advertisement

How to choose values from col1 if their values are in col2 but not on list in Python Pandas?

I have DataFrame in Python Pandas like below:

col1       | description
---------- |-----------
John Simon |John Simon red
Terry Juk  |green Terry Juk
John Bravo |John Bravo brown
Ann Still  |orange Ann Still

bad_list = ["red", "green"]

And I need to select only these people from “col1” which have value from “col” in column “description” and something else (nevermind before or after), but it something else could not be from bad_list.

So I need to select only John Bravo and Ann Still because they have their value from “col1” in column “description” and does not have words from bad_list with their name in column “description”.

How to do that in Python Pandas?

Advertisement

Answer

Try:

bad_list = ["red", "green"]

mask = df["description"].str.contains(r"|".join(bad_list))
print(df.loc[~mask, "col1"])

Prints:

2    John Bravo
3     Ann Still
Name: col1, dtype: object

EDIT: To check if value from col1 is in description:

bad_list = ["red", "green"]

mask1 = df["description"].str.contains(r"|".join(bad_list))
mask2 = df.apply(lambda x: x["col1"] in x["description"], axis=1)
print(df.loc[~mask1 & mask2, "col1"])

EDIT2: To ignore case:

import re

bad_list = ["RED", "green"]

mask1 = df["description"].str.contains(r"|".join(bad_list), flags=re.I)
mask2 = df.apply(lambda x: x["col1"] in x["description"], axis=1)
print(df.loc[~mask1 & mask2, "col1"])
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement