
Tag: spacy

How to solve Spanish lemmatization problems with SpaCy?

When I try to lemmatize a CSV of more than 60,000 Spanish words, spaCy lemmatizes certain words incorrectly. I understand that the model is not 100% accurate. However, I have not found any other solution, since NLTK does not ship a Spanish model. A friend tried asking this question on the Spanish Stack Overflow; however, the community there is quite small

Getting a specific element from a list of tuples

Using spaCy, I have the following: It is a list of tuples. I want to extract the person element. This is what I do: What would be a better way? Answer: To keep the first element of every tuple whose second element is PERSON, use a list comprehension with an if clause at the end. This corresponds to
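The list comprehension described in this answer can be sketched as follows. The question's actual data is not shown in the excerpt, so the `entities` list here is a hypothetical stand-in made of (text, label) pairs.

```python
# Hypothetical (text, label) pairs standing in for the list elided from the question.
entities = [("Alice", "PERSON"), ("Paris", "GPE"), ("Bob", "PERSON")]

# Keep the first element of each tuple whose second element is "PERSON".
people = [text for text, label in entities if label == "PERSON"]

print(people)
```

Unpacking each tuple into `text` and `label` in the `for` clause avoids indexing with `[0]` and `[1]` and keeps the filter condition readable.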

Is it possible to use spacy with already tokenized input?

I have a sentence that has already been tokenized into words. I want to get the part-of-speech tag for each word in the sentence. When I checked the spaCy documentation, I realized that it starts from the raw sentence. I don’t want to do that, because spaCy might then end up with a different tokenization.
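One possible approach, sketched here under assumptions, is to build a `Doc` directly from the existing word list and then apply the pipeline components to it, so spaCy's tokenizer never runs. The helper name `tag_pretokenized` is made up for this illustration, and `spacy.blank("en")` is used only so the sketch runs without a downloaded model; a blank pipeline has no tagger, so `pos_` stays empty until a trained pipeline is loaded with `spacy.load` instead.

```python
import spacy
from spacy.tokens import Doc


def tag_pretokenized(nlp, words):
    """Build a Doc from pre-split words and run each pipeline component on it."""
    doc = Doc(nlp.vocab, words=words)
    # nlp.pipeline holds (name, component) pairs; the tokenizer is not among them.
    for _name, component in nlp.pipeline:
        doc = component(doc)
    return doc


# Assumption: a blank English pipeline, which contains no tagger or parser.
nlp = spacy.blank("en")
words = ["I", "can't", "wait", "."]

doc = tag_pretokenized(nlp, words)
print([(token.text, token.pos_) for token in doc])

# For comparison, the default tokenizer splits the raw sentence differently.
print([token.text for token in nlp("I can't wait.")])
```

The comparison at the end shows why this matters: given the raw string, the English tokenizer splits "can't" into two tokens, whereas the hand-built `Doc` keeps the original four.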

ImportError: No module named ‘spacy.en’

I’m working on a codebase that uses spaCy. I installed spaCy using: and then: At the end of this last command, I got a message: Now, when I try running my code, on the line: it gives me the following error: I’ve looked on Stack Exchange, and the closest question is: Import error with spacy: “No module named en” which does not

Spacy link error

When running: the following is printed: Warning: no model found for ‘en’. Only loading the ‘en’ tokenizer. The /site-packages/spacy/data directory is empty except for the init file. All file paths point to my single installation of Python. Any help resolving this is appreciated. Thanks! Will. Answer: I had this same issue when I tried this on Windows 10 –

What do spaCy’s part-of-speech and dependency tags mean?

spaCy tags up each of the Tokens in a Document with a part of speech (in two different formats, one stored in the pos and pos_ properties of the Token and the other stored in the tag and tag_ properties) and a syntactic dependency to its .head token (stored in the dep and dep_ properties). Some of these tags are
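A quick way to look up what an individual label means is spaCy's built-in glossary, exposed through `spacy.explain`. The sketch below assumes only that spaCy is installed; the three labels are arbitrary examples of a coarse part of speech, a fine-grained tag, and a dependency label.

```python
import spacy

# Labels chosen only for illustration: one pos_ value, one tag_ value, one dep_ value.
labels = ["ADJ", "VBZ", "nsubj"]

# spacy.explain returns a short description, or None if the label is not in its glossary.
descriptions = {label: spacy.explain(label) for label in labels}

for label, description in descriptions.items():
    print(f"{label}: {description}")
```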

How to get the dependency tree with spaCy?

I have been trying to find out how to get the dependency tree with spaCy, but I can’t find anything on how to get the tree, only on how to navigate it. Answer: It turns out the tree is available through the tokens in a document. If you want to find the root of the tree, you can just go
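The idea that the tree lives on the tokens can be sketched like this. To stay runnable without a trained parser, the example assumes spaCy v3 and supplies a hand-written parse through the `heads` and `deps` arguments of `Doc`; with a real pipeline, the parser would fill these in. The sentence, its annotations, and the `print_tree` helper are all made up for illustration.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")

# Hand-written parse (an assumption for this sketch); in spaCy v3 the heads
# are absolute token indices, and the root points at itself.
words = ["She", "reads", "books"]
heads = [1, 1, 1]
deps = ["nsubj", "ROOT", "dobj"]
doc = Doc(nlp.vocab, words=words, heads=heads, deps=deps)

# The root of the tree is the token that is its own head.
roots = [token for token in doc if token.head.i == token.i]


def print_tree(token, depth=0):
    """Print a token and, indented beneath it, each of its dependents."""
    print("  " * depth + f"{token.text} ({token.dep_})")
    for child in token.children:
        print_tree(child, depth + 1)


for root in roots:
    print_tree(root)
```

Starting from the root and recursing over `token.children` visits every token once, which is usually enough to rebuild the tree in whatever structure the caller needs.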
