I am trying to create a new dataframe that pulls rows based on multiple terms across multiple columns. I have a huge Excel file (65k rows) that I am loading into a df so that I can pull out new priority reports.
So as an example, this is what I am using to search for multiple terms across 1 column (columnA in this example). I want to be able to do this same search (for multiple terms), but across 2 or 3 different columns instead of just columnA.
newdf = (df.loc[df.columnA.str.contains('dbcor|nopgms|swcor|bkupmems', case=False, regex=True, na=False)])
Answer
newdf = df.loc[df.apply(lambda row: row.str.contains('dbcor|nopgms|swcor|bkupmems', case=False, regex=True, na=False).any(), axis=1)]
will return rows where any value in the row matches the pattern. Replace any with all if you need every value in the row to match it.
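Note that the apply call above checks every column of each row. To limit the search to the 2 or 3 columns mentioned in the question, one option is to select those columns first and build a boolean mask from them. The following is only a minimal sketch: columnB, columnC, columnD and the sample data are made up for illustration, and astype(str) is an added assumption so that numeric columns do not break the .str accessor.

```python
import pandas as pd

# Hypothetical sample data; only columnA comes from the question.
df = pd.DataFrame({
    'columnA': ['DBCOR failed', 'ok', None, 'fine'],
    'columnB': ['x', 'y', 'swcor alert', 'z'],
    'columnC': [1, 2, 3, 4],
    'columnD': ['nopgms', 'nopgms', 'nopgms', 'bkupmems'],  # not searched
})

pattern = 'dbcor|nopgms|swcor|bkupmems'
cols = ['columnA', 'columnB', 'columnC']  # the columns to search

# Test each selected column against the pattern, then keep rows
# where at least one of those columns matched.
mask = df[cols].apply(
    lambda col: col.astype(str).str.contains(pattern, case=False, regex=True, na=False)
).any(axis=1)

newdf = df.loc[mask]
print(newdf)
```

Here columnD is left out of cols on purpose, so its matches are ignored. As with the row-wise version, swapping .any(axis=1) for .all(axis=1) would keep only rows where every selected column matches.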