I’m trying to search the sentences in one dataframe for key terms that are stored in another dataframe, returning every term that is found.
My code below works to extract the keywords. However, some of the keywords overlap, and it only pulls the first result it finds, when I would like it to pull as many matches as are present:
df1
id | keyword |
---|---|
0 | we are |
1 | we |
2 | this is |
df2
id | Sentence | Result [with current code] | Result [what I want] |
---|---|---|---|
0 | we are us | we | we are, we |
1 | this is who we | this is, we | this is, we |
    Keywords = df1['Keyword']
    s = set(Keywords)
    df2['Result'] = df2['Sentence'].apply(lambda x: ', '.join(set(x.split()).intersection(s)))
I don’t need it to be particularly quick, but I would like it to be accurate and give me every related result.
Answer
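Use a list comprehension that tests each keyword against the whole sentence with `in`, instead of splitting the sentence into single words: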
    df2['result'] = [', '.join([words for words in df1.keyword if words in sentence]) for sentence in df2.Sentence]
    print(df2)

       id        Sentence       result
    0   0       we are us   we are, we
    1   1  this is who we  we, this is
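One caveat: `words in sentence` is a plain substring test, so a keyword such as `we` would also match inside a longer word like `went`. If that matters for your data, a possible variant is to match on word boundaries with a regular expression. The sketch below is only an illustration under that assumption; the helper name `find_keywords` and the third sample sentence are hypothetical and not part of the original question. It uses plain lists so it runs without pandas.

```python
import re

def find_keywords(sentence, keywords):
    """Return every keyword that occurs in `sentence` as a whole word or phrase."""
    found = []
    for kw in keywords:
        # \b anchors the match to word boundaries, so "we" does not match inside "went".
        # re.escape guards against keywords that contain regex metacharacters.
        pattern = r"\b" + re.escape(kw) + r"\b"
        if re.search(pattern, sentence):
            found.append(kw)
    return found

# Sample data mirroring df1.keyword and df2.Sentence, plus one extra (hypothetical) sentence
keywords = ["we are", "we", "this is"]
sentences = ["we are us", "this is who we", "went to this island"]

results = [", ".join(find_keywords(s, keywords)) for s in sentences]
print(results)
```

With the dataframes from the question, the same helper could be applied as `df2['result'] = df2['Sentence'].apply(lambda s: ', '.join(find_keywords(s, df1['keyword'])))`. Matches come back in the order the keywords appear in `df1`, the same as in the list comprehension above.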