Meaningless spaCy Nouns

Question

I am using spaCy to extract nouns from sentences. These sentences are grammatically poor and may also contain spelling mistakes.

Here is the code that I am using:

Code Output:

Similarly, for the sentence "fast foward2", I get the spaCy noun as

This shows that the extracted nouns include meaningless words such as: sfx, foward2, ms, 64x, bit, pwm, r, brailledisplayfastmovement.

Accepted Answer

It seems you can use the pyenchant library:

Enchant is used to check the spelling of words and suggest corrections for words that are misspelled. It can use many popular spellchecking packages to perform this task, including ispell, aspell and MySpell. It is quite flexible at handling multiple dictionaries and multiple languages.

More information is available on the Enchant website: https://abiword.github.io/enchant/

Sample Python code:

    import re

    import enchant  # pip install pyenchant
    import spacy

    d = enchant.Dict("en_US")
    nlp = spacy.load("en_core_web_sm")

    sentence = "For example, it filters nouns like motorbike, whoosh, trolley, metal, suitcase, zip etc"
    # Replace each run of non-word characters and underscores with a single
    # space (\W and _ merged into one character class).
    cleanString = re.sub(r"[\W_]+", " ", sentence.lower())
    doc = nlp(cleanString)

    # Print only the tokens tagged as nouns that the dictionary recognizes.
    for token in doc:
        if token.pos_ == "NOUN" and d.check(token.text):
            print(token.text)
    # => [example, nouns, motorbike, whoosh, trolley, metal, suitcase, zip]
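To make the filtering step easier to try in isolation, here is a minimal sketch that runs without spaCy or pyenchant installed. Everything in it is an assumption made for illustration: the helper name keep_meaningful_nouns, the tiny DEMO_VOCAB set standing in for a real dictionary, and the hand-written (text, pos) pairs standing in for spaCy's token.text and token.pos_. It also adds an isalpha() check, which is not part of the code above, to drop tokens such as "foward2" and "64x" before the dictionary lookup.

```python
from typing import Callable, Iterable, List, Tuple


def keep_meaningful_nouns(
    tagged_tokens: Iterable[Tuple[str, str]],
    is_known_word: Callable[[str], bool],
) -> List[str]:
    """Return nouns that are purely alphabetic and pass the dictionary check."""
    kept = []
    for text, pos in tagged_tokens:
        if pos != "NOUN":
            continue
        if not text.isalpha():  # rejects tokens such as "foward2" or "64x"
            continue
        if is_known_word(text):
            kept.append(text)
    return kept


# Hypothetical stand-in vocabulary. With pyenchant you would pass
# enchant.Dict("en_US").check instead of DEMO_VOCAB.__contains__.
DEMO_VOCAB = {"motorbike", "trolley", "metal", "suitcase"}

# Hand-written (text, pos) pairs standing in for (token.text, token.pos_).
demo_tokens = [
    ("motorbike", "NOUN"),
    ("sfx", "NOUN"),
    ("foward2", "NOUN"),
    ("64x", "NOUN"),
    ("fast", "ADJ"),
    ("trolley", "NOUN"),
]

print(keep_meaningful_nouns(demo_tokens, DEMO_VOCAB.__contains__))
# prints ['motorbike', 'trolley']
```

With real data you would build the pairs from a spaCy doc, for example ((t.text, t.pos_) for t in doc), and pass d.check as the predicate. Keeping the predicate as a parameter makes it easy to swap in a domain-specific word list if a general English dictionary rejects terms you want to keep.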

Answer