I have a dataframe column text
text
'a red apple'
'the apple is sweet'
'a yellow banana'
'a green melon'
I would like to create another column term
by matching it with a list ['apple', 'banana', 'melon']
for term in the_list:
    df['term'] = df['text'].apply(lambda x: term if term in x else 'None')
The result I get
text                  term
'a red apple'         None
'the apple is sweet'  None
'a yellow banana'     None
'a green melon'       melon
However, I expected it to be
text                  term
'a red apple'         apple
'the apple is sweet'  apple
'a yellow banana'     banana
'a green melon'       melon
I sense that it might be because I use a list but I don’t know how to make a loop in lambda itself
Answer
Using the split method will only work if the strings always have the same format. You have to switch around the loop and the lambda expression, like so:
import pandas as pd

df = pd.DataFrame(['a red apple', 'a banana yellow', 'a green melon'], columns=['text'])
the_list = ['apple', 'banana', 'melon']

def fruit_finder(string):
    # Default when no term from the_list occurs in the string
    term_return = 'None'
    for term in the_list:
        if term in string:
            term_return = term
    return term_return

# Check every term against each row, instead of one term against all rows
df['term'] = df['text'].apply(fruit_finder)
print(df)
This returns the matching value from the list for each row and results in the following output:
              text    term
0      a red apple   apple
1  a banana yellow  banana
2    a green melon   melon
Edit: The reason your initial program doesn’t work is that your loop and lambda are mixed up. You are looping through the terms and applying only one term to the whole dataframe on each pass, so each pass overwrites the term column written by the previous one. The last pass of the loop only checks for the term melon, so the apple and banana rows come up as None.
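Since the question asks how to put the loop inside the lambda itself, here is a minimal sketch of one way that could look, using a generator expression with the built-in next(). It assumes the list is named the_list and holds three separate terms, and it reuses the sample rows from the question. Note one difference from fruit_finder: this version returns the first matching term in the list, while fruit_finder keeps the last one, which only matters when a row contains more than one term.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame(
    ['a red apple', 'the apple is sweet', 'a yellow banana', 'a green melon'],
    columns=['text'],
)
the_list = ['apple', 'banana', 'melon']

# For each row, take the first term in the_list that occurs in the text;
# next() falls back to the string 'None' when no term matches.
df['term'] = df['text'].apply(
    lambda x: next((term for term in the_list if term in x), 'None')
)
print(df)
```

The loop now runs once per row inside the lambda, so the term column is assigned a single time rather than being overwritten on every pass.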