Skip to content
Advertisement

Search multi-string in pandas column

I have a pandas dataframe which looks like this:

       col1   col2
   0   1      A
   1   10     A, B
   2   20     B
   3   5      C
   4   70     A, B, C

Now I want to search col2 on a given condition and select rows accordingly. For example:

search_pattern = ["A"] -> Select all rows where A is present [rows 0, 1, 4]
search_pattern = ["A", "B"] -> Select all rows where A is present and B is present [rows 1, 4]
search_pattern = ["B"] -> Select all rows where B is present [rows 1, 2, 4]

    

Advertisement

Answer

You can split values and compare with sets by issubset in Series.map:

search_pattern1 = ["A"] 
search_pattern2 = ["A", "B"] 
search_pattern3 = ["B"] 

pats = [search_pattern1, search_pattern2, search_pattern3]

for pat in pats:
    df1 = df[df['col2'].str.split(', ').map(set(pat).issubset)]
    print (df1)

   col1     col2
0     1        A
1    10     A, B
4    70  A, B, C
   col1     col2
1    10     A, B
4    70  A, B, C
   col1     col2
1    10     A, B
2    20        B
4    70  A, B, C
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement