I’m using spaCy and I am looking for a program that counts the frequency of each word in a text and outputs each word with its count and the numbers of the sentences where it appears.
Sample input
Python is cool. But Ocaml is cooler since it is purely functional.
Sample output
1 Python 1
3 is 1 2
1 cool 1
1 But 2
1 Ocaml 2
1 cooler 2
1 since 2
1 it 2
1 purely 2
1 functional 2
Answer
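I would split the text into sentences, split each sentence into words, and build a dictionary keyed by word, where each value holds the word's count and the list of sentence numbers in which it appears, like so: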
text = "Python is cool. But Ocaml is cooler since it is purely functional."
specialSymbols = '.,;:'

# Split the text into sentences, then each sentence into words with punctuation stripped
words = [[word.strip(specialSymbols) for word in sentence.split(' ')] for sentence in text.split('. ')]

# Map each word to [count, list of sentence numbers]
d = {word: [0, []] for sentence in words for word in sentence}
for i, sentence in enumerate(words):
    for word in sentence:
        d[word][0] += 1
        # Record each sentence number only once per word
        if i + 1 not in d[word][1]:
            d[word][1].append(i + 1)

# Print: count, word, sentence numbers
for key, val in d.items():
    print(f'{val[0]} {key} {" ".join([str(i) for i in val[1]])}')
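The snippet above splits sentences only on '. ' and words only on single spaces, so text with '?', '!', or line breaks may not be counted as intended. Below is a minimal sketch of the same dictionary approach wrapped in a function and using regular expressions from the standard library. The function name word_report and both patterns are assumptions made for illustration, not part of the original answer: a sentence is taken to end at '.', '!' or '?' followed by whitespace, and a word is taken to be a run of letters, digits, or apostrophes.

```python
import re


def word_report(text):
    """Return {word: [count, [sentence numbers]]} in order of first appearance."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]
    report = {}
    for number, sentence in enumerate(sentences, start=1):
        # Assumption: a word is a run of letters, digits, or apostrophes.
        for word in re.findall(r"[A-Za-z0-9']+", sentence):
            entry = report.setdefault(word, [0, []])
            entry[0] += 1
            # Record each sentence number only once per word
            if number not in entry[1]:
                entry[1].append(number)
    return report


text = "Python is cool. But Ocaml is cooler since it is purely functional."
for word, (count, sentence_numbers) in word_report(text).items():
    print(count, word, *sentence_numbers)
```

Like the original, this treats words case-sensitively, so "But" and "but" would be counted separately; lowercasing each word before the lookup would merge them if that is preferred.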