Retrieve all occurrences from selected attributes to separate column in pandas

Question

I want to extract colors from product descriptions. I tried to use NER, but it wasn't successful. Now I am trying to define a list and match it against the description. I have data in a dataframe column like this: I also defined the list of colors. What I did was to create a matcher. And I applied it to the

Accepted Answer

Jezreel's first answer is very good! However, when using

    df['Colours'] = df['Description pre-work'].str.findall('|'.join(attributes), flags=re.I)

it will always find "red" inside words such as "Tampered", because the pattern matches substrings rather than whole words. I suggest an easy quick fix (which is not the most robust one):

    def matcher(desc):
        colors = []
        # split the sentence into words and look for an exact match
        words = desc.lower().replace(';', ' ').replace('-', ' ').replace('/', ' ').split(" ")
        for color in attributes:
            if color in words:
                colors.append(color)
        return colors
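Another way to avoid the substring problem is to keep the regex approach but wrap the alternatives in word boundaries. This is only a minimal sketch, not part of the original answer: the colour list and the sample description are made up for illustration, and `build_pattern` and `find_colors` are hypothetical helper names.

```python
import re

# Hypothetical colour list; the original `attributes` list is not shown in the post.
attributes = ["red", "blue", "green"]

def build_pattern(colors):
    # \b anchors each alternative at word boundaries, so "red" no longer
    # matches inside "Tampered"; re.escape guards against regex metacharacters.
    return r"\b(?:" + "|".join(re.escape(c) for c in colors) + r")\b"

pattern = build_pattern(attributes)

def find_colors(desc):
    # Case-insensitive search; matches are lower-cased for consistency.
    return [m.lower() for m in re.findall(pattern, desc, flags=re.I)]

# Made-up description for illustration.
print(find_colors("Tampered seal; Red/blue cover"))
```

With pandas, the same pattern string should work in place of `'|'.join(attributes)` in the `.str.findall(..., flags=re.I)` call above. Unlike the word-splitting matcher, this also handles separators other than `;`, `-` and `/`.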

Answer