Tag: spacy

Convert from Prodigy’s JSONL format for labeled NER to spaCy’s training format?

named-entity-recognition nlp prodigy spacy sqlite

I am new to Prodigy and spaCy, as well as to working on the command line. I’d like to use Prodigy to label my data for an NER model, and then use spaCy in Python to create models. Prodigy outputs in SQLite format. spaCy takes a different kind of format, and I’m not sure what to call it: How can I convert from one to
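As a rough illustration of the conversion being asked about, here is a minimal sketch. It assumes the annotations have been exported as JSONL (for example with `prodigy db-out`), that each record carries `text`, `spans` (with `start`, `end`, `label`) and `answer` fields, and that the target is the `(text, {"entities": [...]})` tuple style used in spaCy v2 training examples. The function name `prodigy_to_spacy` and the sample records are hypothetical.

```python
import json


def prodigy_to_spacy(lines):
    """Turn Prodigy-style NER records (one JSON object per line) into
    (text, {"entities": [(start, end, label), ...]}) tuples."""
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: only accepted annotations should be used for training.
        if record.get("answer", "accept") != "accept":
            continue
        entities = [
            (span["start"], span["end"], span["label"])
            for span in record.get("spans", [])
        ]
        examples.append((record["text"], {"entities": entities}))
    return examples


# Hypothetical sample records standing in for an exported JSONL file.
sample = [
    '{"text": "Apple hired Tim Cook.", "spans": ['
    '{"start": 0, "end": 5, "label": "ORG"}, '
    '{"start": 12, "end": 20, "label": "PERSON"}], "answer": "accept"}',
    '{"text": "Nothing here.", "spans": [], "answer": "reject"}',
]

train_data = prodigy_to_spacy(sample)
print(train_data)
```

To read from a real export, the same function can be passed an open file handle, since iterating over a file yields its lines. Newer spaCy versions (v3 and later) train from binary `.spacy` files instead, so these tuples would need a further conversion step there.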

How to solve Spanish lemmatization problems with SpaCy?

lemmatization python spacy

When trying to lemmatize a CSV with more than 60,000 Spanish words, spaCy does not write certain words correctly. I understand that the model is not 100% accurate. However, I have not found any other solution, since NLTK does not come with a Spanish core. A friend tried to ask this question on the Spanish Stack Overflow; however, the community is quite small

Getting a specific element from a list of tuples

python spacy tuples

Using spaCy, I have the following: It is a list of tuples. I want to extract the person element. This is what I do: What would be a better way? Answer To keep every first element whose second element is PERSON, use a list comprehension with an if clause at the end. This corresponds to
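The list comprehension described in the answer might look like the sketch below. The original data is not shown in the excerpt, so the `entities` list here is hypothetical sample data in the assumed `(text, label)` shape.

```python
# Hypothetical sample data: (entity text, entity label) pairs.
entities = [
    ("Alice Smith", "PERSON"),
    ("Paris", "GPE"),
    ("Bob", "PERSON"),
    ("Acme", "ORG"),
]

# Keep the first element of each tuple whose second element is "PERSON".
persons = [text for text, label in entities if label == "PERSON"]
print(persons)
```

Unpacking each tuple into `text` and `label` avoids index lookups such as `item[1]` and keeps the filter condition readable.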

Is it possible to use spacy with already tokenized input?

nlp python spacy

I have a sentence that has already been tokenized into words. I want to get the part-of-speech tag for each word in the sentence. When I checked the spaCy documentation, I realized it starts with the raw sentence. I don’t want to do that because, in that case, spaCy might end up with a different tokenization.

ImportError: No module named ‘spacy.en’

python spacy

I’m working on a codebase that uses spaCy. I installed spaCy using: and then At the end of this last command, I got a message: Now, when I try running my code, on the line: it gives me the following error: I’ve looked on Stack Exchange and the closest is: Import error with spacy: “No module named en” which does not

Spacy link error

models python spacy

When running: the following is printed: Warning: no model found for ‘en’ Only loading the ‘en’ tokenizer. /site-packages/spacy/data is empty with the exception of the init file. All file paths point only to my single installation of Python. Any help resolving this is appreciated. Thanks! Will Answer I had this same issue when I tried this on Windows 10 –

What do spaCy’s part-of-speech and dependency tags mean?

nlp python spacy

spaCy tags up each of the Tokens in a Document with a part of speech (in two different formats, one stored in the pos and pos_ properties of the Token and the other stored in the tag and tag_ properties) and a syntactic dependency to its .head token (stored in the dep and dep_ properties). Some of these tags are

How to get the dependency tree with spaCy?

python spacy

I have been trying to find out how to get the dependency tree with spaCy, but I can’t find anything on how to get the tree, only on how to navigate it. Answer It turns out the tree is available through the tokens in a document. If you want to find the root of the tree, you can just go