With the following dataframe:
   Sentence
0  This is an example of sentence
1  This is another example
2  This is an dfferent example
3  A sentence is a bag of words
4  Random words
And the following list:
['sentence', 'another', 'words']
What is the most efficient way to summarize the occurrence of each word from the list in each row of the column ‘Sentence’? I’m looking for the following result:
   Sentence                        word_occurence
0  This is an example of sentence  sentence
1  This is another example         another
2  This is an dfferent example
3  A sentence is a bag of words    [sentence, words]
4  Random words                    words
Thanks in advance!
Answer
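You can also do it with the apply function. This assumes numpy is imported as np and that w holds the list of words; note that the column is named Sentence, with a capital S:
w = ['sentence', 'another', 'words']
df.assign(word_occurence=lambda x: x['Sentence'].apply(lambda s: np.array([witem for witem in w if witem in s])))
Keep in mind that `witem in s` is a substring test, so 'words' would also match inside a longer word such as 'passwords', and every row gets an array (empty when nothing matches) rather than a bare string.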
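For reference, here is a self-contained sketch of the same apply idea that can be run as-is. It rebuilds the sample dataframe from the question and binds the list to the name w. The helper find_words is a hypothetical name introduced for illustration, and it matches whole whitespace-separated words instead of substrings, which is an assumption about what the question wants rather than what the one-liner above does.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"Sentence": [
    "This is an example of sentence",
    "This is another example",
    "This is an dfferent example",
    "A sentence is a bag of words",
    "Random words",
]})
w = ["sentence", "another", "words"]


def find_words(sentence, words):
    """Return the entries of `words` that appear as whole words in `sentence`.

    Hypothetical helper: splits on whitespace and lowercases, so punctuation
    attached to a word would prevent a match.
    """
    tokens = set(sentence.lower().split())
    return [word for word in words if word in tokens]


# Each row gets a list of matches; rows without a match get an empty list.
result = df.assign(
    word_occurence=df["Sentence"].apply(lambda s: find_words(s, w))
)
print(result)
```

Building a set of tokens per row keeps each lookup cheap, which may matter if the word list is long. If a bare string is preferred for single matches, as in the desired output, the lists can be post-processed afterwards.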