Finding words within paragraph using Python [closed]

Question

Closed 2 years ago. This question needs to be more focused. It is not currently accepting answers.

Let's say I have the following words, Test_wrds = ['she', 'her', 'women'], that I would like to see whether any one

Accepted Answer

First of all, to count how many times each word from the Test_wrds list occurs in the text, you can use ORTH, which is the ID of a token's verbatim text content.

    import spacy
    from spacy.lang.en import English
    from spacy.attrs import ORTH

    text = " Q: What recent discussions she has had with the Secretary of State for Work and Pensions on the effect of that Department’s welfare policies on women."
    Test_wrds = ['she', 'her', 'women']

    nlp = English()
    doc = nlp(text)

    # Dictionary mapping each word's ID to the number of times
    # that word appears in the text
    count_number = doc.count_by(ORTH)

    for wid, number in sorted(count_number.items(), key=lambda x: x[1]):
        # nlp.vocab.strings[wid] gives the word corresponding to the ID
        if nlp.vocab.strings[wid] in Test_wrds:
            print(number, nlp.vocab.strings[wid])

Output:

    1 she
    1 women

Second, to wrap each matching word in bold markers, you can try:

    import re

    # Separate each '.' from the preceding word so that 'women.' matches 'women'
    text = text.replace('.', ' .')
    lista = text.split()

    for word in Test_wrds:
        if word in lista:
            # Find the list indices where this word occurs
            indices = [i for i, j in enumerate(lista) if j == word]
            for index in indices:
                # re.escape keeps special characters in the word from acting as a pattern
                lista[index] = re.sub(re.escape(word), '**' + word + '**', lista[index])

    # Join once, after all replacements, so new_text exists even with no matches
    new_text = ' '.join(lista)

Output:

    >>> new_text
    'Q: What recent discussions **she** has had with the Secretary of State for Work and Pensions on the effect of that Department’s welfare policies on **women** .'

Answer