
Create Pandas DF by searching for multiple record values across multiple columns

I am trying to create a new DataFrame by pulling rows that match any of several terms across multiple columns. I have a large Excel file (65k rows) that I load into a df so that I can pull out new priority reports.

So as an example, this is what I am using to search for multiple terms across 1 column (columnA in this example). I want to be able to do this same search (for multiple terms), but across 2 or 3 different columns instead of just columnA.

newdf = (df.loc[df.columnA.str.contains('dbcor|nopgms|swcor|bkupmems', case=False, regex=True, na=False)])

Answer

newdf = df.loc[df.apply(lambda row: row.str.contains('dbcor|nopgms|swcor|bkupmems', case=False, regex=True, na=False).any(), axis=1)]

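This returns the rows where any value, in any column of the row, matches the pattern. Replace any with all if every value in the row must match it. Note that this checks every column of df, not just two or three selected ones.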
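If the search should cover only two or three specific columns, as the question asks, one possible approach is to select those columns first and apply str.contains column by column, which also avoids a Python-level call per row on a 65k-row frame. The following is a minimal sketch, not the answerer's code: the sample data and the column names columnB and columnC are made up for illustration, and it assumes the searched columns hold strings (or missing values).

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel file.
df = pd.DataFrame({
    "columnA": ["dbcor failure", "ok", None, "fine"],
    "columnB": ["ok", "NOPGMS found", "ok", "fine"],
    "columnC": [1, 2, 3, 4],  # non-text column, deliberately not searched
})

pattern = "dbcor|nopgms|swcor|bkupmems"
cols = ["columnA", "columnB"]  # assumed names of the columns to search

# Build one boolean column per searched column, then keep rows
# where at least one of them is True.
mask = df[cols].apply(
    lambda col: col.str.contains(pattern, case=False, regex=True, na=False)
).any(axis=1)

newdf = df.loc[mask]
print(newdf)
```

As with the row-wise version, swapping .any(axis=1) for .all(axis=1) would keep only rows where every selected column matches.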

User contributions licensed under: CC BY-SA