Create and fill a DataFrame column based on conditions

I have a DataFrame and I need to create a new column and fill its values according to how many words from a list of words are found in a text. I'm trying the code below:

import re
import pandas as pd

df = pd.DataFrame({'item': ['a1', 'a2', 'a3'],
                   'text': ['water, rainbow', 'blue, red, white', 'country,school,magic']})

list_of_words = ['water', 'pasta', 'black', 'magic', 'glasses', 'school', 'book']

for index, row in df.iterrows():
    text = row['text']
    count_found_words = 0
    for word in list_of_words:
        found_words = re.findall(word, text)
        if len(found_words) > 0:
            count_found_words += 1
    df['found_words'] = count_found_words

This code does create a new column, but it fills every row with the last value of `count_found_words` from the loop.

Is there a right way to do this?

Answer

pattern = fr"\b({'|'.join(list_of_words)})\b"

df["found_words"] = df.text.str.findall(pattern).str.len()
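
For context, the loop in the question fills every row with the same number because `df['found_words'] = count_found_words` assigns a single scalar to the whole column on each iteration, so only the last row's count survives. Below is a minimal, self-contained sketch of the vectorized approach using the question's sample data. The `re.escape` call and the `distinct_found_words` column are additions for illustration, not part of the original answer: `findall` counts every occurrence, while the question's loop counts each listed word at most once per row.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "item": ["a1", "a2", "a3"],
    "text": ["water, rainbow", "blue, red, white", "country,school,magic"],
})
list_of_words = ["water", "pasta", "black", "magic", "glasses", "school", "book"]

# Build one alternation pattern; \b restricts matches to whole words.
# re.escape is an added precaution in case a word contains regex metacharacters.
pattern = r"\b(?:" + "|".join(map(re.escape, list_of_words)) + r")\b"

# All matches per row, as a Series of lists.
matches = df["text"].str.findall(pattern)

# Total number of occurrences of listed words in each row.
df["found_words"] = matches.str.len()

# Hypothetical extra column: each listed word counted at most once per row,
# which is closer to what the question's loop computes.
df["distinct_found_words"] = matches.apply(lambda hits: len(set(hits)))

print(df)
```

For the sample data both columns agree; they differ only when a listed word appears more than once in the same text.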

Answer