
Counting word frequency in a sentence

I have two columns – one with sentences and the other with single words.

Sentence                                      word
“Such a day! It’s a beautiful day out there”  “beautiful”
“Such a day! It’s a beautiful day out there”  “day”
“I am sad by the sad weather”                 “weather”
“I am sad by the sad weather”                 “sad”

I want to count how many times each value in the “word” column occurs in the “Sentence” value of the same row, and get this output:

Sentence                                      word         n
“Such a day! It’s a beautiful day out there”  “beautiful”  1
“Such a day! It’s a beautiful day out there”  “day”        2
“I am sad by the sad weather”                 “weather”    1
“I am sad by the sad weather”                 “sad”        2

I tried:

ok = []
for l in [x.split() for x in df['Sentence']]:
    for y in df['word']:
        ok.append(l.count(y))

However, it takes a very long time and never seems to finish, so it is not feasible for my actual dataset, which has 50k rows.

Can anyone help me achieve this?

Answer

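You can do it with zip, which pairs each sentence with the word in the same row, so every sentence is checked against one word only instead of against every word in the column: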

df['new'] = [x.count(y) for x, y in zip(df.Sentence,df.word)]
df
Out[419]: 
                                     Sentence       word  new
0  Such a day! It's a beautiful day out there  beautiful    1
1  Such a day! It's a beautiful day out there        day    2
2                 I am sad by the sad weather    weather    1
3                 I am sad by the sad weather        sad    2
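
One caveat: str.count counts substring occurrences, so a short word such as "a" would also be counted inside "day" or "beautiful". If you need whole-word matches only, a regex with word boundaries is one option. The following is a minimal sketch, not a definitive implementation; the helper name count_whole_word is hypothetical, and it uses plain lists in place of the two DataFrame columns so that it runs without pandas.

```python
import re

# Plain lists standing in for df['Sentence'] and df['word'].
sentences = [
    "Such a day! It's a beautiful day out there",
    "Such a day! It's a beautiful day out there",
    "I am sad by the sad weather",
    "I am sad by the sad weather",
]
words = ["beautiful", "day", "weather", "sad"]


def count_whole_word(sentence, word):
    """Count occurrences of `word` in `sentence` as a whole word (hypothetical helper)."""
    # \b matches at a word boundary; re.escape keeps regex
    # metacharacters in `word` from being interpreted as a pattern.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, sentence))


# Pair each sentence with the word from the same row, as in the zip answer.
counts = [count_whole_word(s, w) for s, w in zip(sentences, words)]
print(counts)
```

With the DataFrame from the question, the same comprehension can be assigned to a column: df['n'] = [count_whole_word(s, w) for s, w in zip(df.Sentence, df.word)]. Matching is case-sensitive here (passing flags=re.IGNORECASE to re.findall would relax that), and \b may not behave as expected for words that begin or end with punctuation.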
User contributions licensed under: CC BY-SA