Skip to content
Advertisement

How to filter a dataframe and each row, based on the presence of strings (from another list) in different columns and add a new column with annotation

I have a dataframe (df1) where I would like to search each row for items from listA. If the dataframe has a row that contains ‘positive’ and one or more of the items from listA, I would like to generate another dataframe (df2) by adding a column called result, listing the listA item + present. Items in list A, may exist as a stand alone item in each row of df1 or they may exist as part of a larger string. I’ve tried using pandas.DataFrame.loc but I am only able to search through one column at a time which isn’t ideal.

    df1 = pd.DataFrame({'column no': ['1', '2', '3', '4'], 
                'name': ['fred', 'sammy', 'tom', 'sam'], 
                'test': ['positive', 'positive', 'negative', 'negative'],
                'date': ["15-'05", "13-'02", "12-'01", "29-'08"],
                'food':['lemon-2.v4*?-10%;ham-12?-0%;orange?-58%', 'cake', 'cheese', 'eggs']})

    listA = ["15-'05",'ham','tom','cake']

Output:

    df2 = pd.DataFrame({'column no': ['1', '2', '3', '4'],
              'name': ['fred', 'sammy', 'tom', 'sam'], 
               'test': ['positive', 'positive', 'negative', 'negative'],
               'date': ["15-'05", "13-'02", "12-'01", "29-'08"],
               'food':['lemon-2.v4*?-10%;ham-12?-0%;orange?-58%', 'cake', 'cheese', 'eggs'],
               'result': ["15-'05, ham, present", "cake, present", 'tom, present', 'not found']})

Advertisement

Answer

Updated:

I have created a function first which is applied to every row (‘axis=1’) and the results are added to the result column.

def check_rows(row):
    same_values = ', '.join([term for term in listA for substring in row.values if term in substring])
    if same_values:
        return same_values+", present"
    else:
        return 'not found'

df1['result'] = df1.apply(lambda x: check_rows(x), axis=1)

Output:

column no   name    test        date    food                                    result
0   1       fred    positive    15-'05  lemon-2.v4*?-10%;ham-12?-0%;orange?-58%  15-'05, ham, present
1   2       sammy   positive    13-'02  cake                                     cake, present
2   3        tom    negative    12-'01  cheese                                    tom, present
3   4        sam    negative    29-'08  eggs                                      not found

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement