Find string in a dataframe from a list in another dataframe

Question

I have 2 pandas dataframes in Python which are set up as follows: Paragraph is a string of multiple words, Name is just a string identifying the Words, and Words is a list of strings. What I want is an expression that will identify which Paragraphs in Dataframe 1 contain Words from Dataframe 2. And

Accepted Answer

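You can split and explode Paragraph, then map each word to its Name using a lookup built from the exploded df_2. Finally, aggregate as a set to keep only unique values:

    # lookup Series: one row per word, indexed by the word, holding its Name
    s = df_2.explode('Words').set_index('Words')['Name']

    # one row per word of each paragraph, mapped to a Name (NaN if no match),
    # then collected back per original row as a set
    df_1['Names'] = (df_1['Paragraph'].str.split()
                     .explode().map(s).dropna()
                     .groupby(level=0).agg(set)
                    )

output:

       Paragraph                    Names
    0  A B C D E          {Second, First}
    1  A F G H L          {Fourth, First}
    2  B J P Q W  {Third, Second, Fourth}
    3  G F D S A                  {First}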
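For anyone who wants to run this end to end, here is a minimal self-contained sketch. The question's original tables are not reproduced here, so the contents of df_1 and df_2 below are made-up sample data, chosen only so that the result is consistent with the output shown; the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data (not the asker's original tables).
df_1 = pd.DataFrame(
    {"Paragraph": ["A B C D E", "A F G H L", "B J P Q W", "G F D S A"]}
)
df_2 = pd.DataFrame(
    {
        "Name": ["First", "Second", "Third", "Fourth"],
        "Words": [["A", "X"], ["B", "Y"], ["J"], ["H", "P"]],
    }
)

# Lookup Series: index is each individual word, value is the Name it belongs to.
s = df_2.explode("Words").set_index("Words")["Name"]

# Split each paragraph on whitespace, put one word per row (the original row
# index is repeated), map words to Names, drop words with no match, then
# collect the Names for each original row into a set.
df_1["Names"] = (
    df_1["Paragraph"].str.split()
    .explode()
    .map(s)
    .dropna()
    .groupby(level=0)
    .agg(set)
)

print(df_1)
```

A few things to keep in mind with this approach: matching is exact and case-sensitive on whitespace-separated tokens, so punctuation attached to a word will prevent a match. A paragraph with no matching words ends up with NaN in Names rather than an empty set. Also, this sketch assumes each word appears under only one Name; as far as I know, map needs the lookup Series to have a unique index, so words shared between Names would call for a merge-based approach instead.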

Answer