I’m trying to understand how a Key-Bigram extractor works, and I cannot understand what the following block of code does. Here is the source code. Everything else is working fine and I understand it well; however, I cannot understand what child for child in possible_words.children does. Answer: token.children uses the dependency parse to get all tokens that directly depend on
Tag: spacy
How to save spaCy training
I have written an intent classification program. It is first trained on training data and then tested on test data. The training process takes a few seconds. What is the best way to save the result of such training, so that the model does not have to be retrained on every call? Is it enough to save train_X and train_y? or does
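One common way to avoid retraining, sketched here under stated assumptions, is to persist the trained model state itself rather than only train_X and train_y. The toy word-count "model" and the names train, predict, save_model, load_model and intent_model.pkl are hypothetical stand-ins for whatever classifier the program actually uses; only the standard-library pickle module is relied on.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def train(texts, labels):
    """Hypothetical toy trainer: count the words seen with each intent label."""
    model = {}
    for text, label in zip(texts, labels):
        model.setdefault(label, Counter()).update(text.lower().split())
    return model


def predict(model, text):
    """Return the intent whose training words overlap most with the input."""
    words = text.lower().split()
    return max(model, key=lambda label: sum(model[label][w] for w in words))


def save_model(model, path):
    """Write the trained state to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Read a previously saved trained state back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Illustrative data only; not taken from the original question.
train_X = ["turn on the light", "switch on the lamp", "play some music", "play a song"]
train_y = ["lights", "lights", "music", "music"]

model_path = Path(tempfile.gettempdir()) / "intent_model.pkl"
save_model(train(train_X, train_y), model_path)  # train once, then persist

restored = load_model(model_path)  # later calls load instead of retraining
print(predict(restored, "please play music"))
```

Saving only train_X and train_y would still require refitting on every call, which is why the sketch stores the fitted state. Pickle files should only be loaded from trusted sources. If the model is a spaCy pipeline, nlp.to_disk and spacy.load serve the same purpose.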
How to resolve TypeError: cannot use a string pattern on a bytes-like object – word_tokenize, Counter and spacy
My dataset is the sales transaction history of an online store. I need to create a category based on the texts in the Description column. I have done some text pre-processing and clustering. This is what the head of the dataframe cat_df looks like: Description Text Cluster9 0 WHITE HANGING HEART T-LIGHT HOLDER white hanging heart t-light holder 1 1 WHITE METAL
How to create a list of tokenized words from a dataframe column using spaCy?
I’m trying to apply spaCy’s tokenizer to a dataframe column to get a new column containing a list of tokens. Assume we have the following dataframe: The code below aims to tokenize the Text column: The result looks like: Now we have a new column, tokens, which holds a Doc object for each sentence. How could we change the code to get a Python
Add a new pattern to the spaCy Entity Ruler with a regex across multiple tokens
I have this code, which works well if I search for exact words. But the regex doesn’t match against the whole sentence, only against each token. I tried adding something like this to create a new entity, but the new label DIN still doesn’t show up in the output. What am I doing wrong? How can I add
Using spacy to redact names from a column in a data frame
I have a data frame named “df1”. This data frame has 12 columns. The last column in this data frame is called notes. I need to find common names like “john, sally and richard” in this column and replace them with xxxx or something similar. I have a working script that creates this data frame from MS SQL.
How to fix spaCy en_training incompatible with current spaCy version
spaCy version: 3.2.1. Python version: 3.9.7. OS: Windows. Answer: For spaCy v2 models, the under-constrained requirement >=2.1.4 means >=2.1.4,<2.2.0 in effect, and as a result this model will only work with spaCy v2.1.x. There is no way to convert a v2 model to v3. You can either use the model with v2.1.x or retrain the model from scratch with your
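The effective range described in the answer can be illustrated with a small check. This is only a sketch: the helper names parse_version and in_effective_range are hypothetical, it handles plain dotted numeric versions only, and it is not how spaCy itself validates model compatibility.

```python
def parse_version(version):
    """Turn a plain dotted version such as '3.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_effective_range(version, lower="2.1.4", upper="2.2.0"):
    """True if lower <= version < upper, i.e. >=2.1.4,<2.2.0 by default."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Print whether each candidate spaCy version falls inside the effective range.
for installed in ("2.1.4", "2.1.9", "2.2.0", "3.2.1"):
    print(installed, in_effective_range(installed))
```

Under these assumptions, 3.2.1 (the version reported in the question) falls outside the range, which matches the answer's point that the v2 model cannot be loaded by a v3 install.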
spacy Entity Ruler pattern isn’t working for ent_type
I am trying to get the entity ruler patterns to use a combination of lemma and ent_type to generate a tag for the phrase “landed (or land) in Baltimore (location)”. It seems to work with the Matcher, but not with the entity ruler I created. I set the override ents option to True, so I’m not really sure why this isn’t working. It
Can’t load spaCy en_core_web_trf
As the guide says, I’ve installed it with (conda environment). I already have spacy-transformers installed. But when I simply do: it shows me this error: More info about the error: Answer: Are you sure you installed spacy-transformers, and that you did so after installing spaCy? I am using pip: pip install spacy-transformers, and I have no problems loading en_core_web_trf.
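A quick way to test the suggestion in the answer is to check that the relevant modules are importable from the same environment that runs the script. The sketch below uses only the standard library for the check; the helper names missing_modules and load_trf_pipeline are hypothetical. Note that the pip package spacy-transformers is imported as spacy_transformers.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def load_trf_pipeline():
    """Load en_core_web_trf, or raise a readable error listing what is missing."""
    required = ["spacy", "spacy_transformers", "en_core_web_trf"]
    missing = missing_modules(required)
    if missing:
        raise RuntimeError("Not importable in this environment: " + ", ".join(missing))
    import spacy  # imported late so the check above still runs without spaCy

    return spacy.load("en_core_web_trf")


if __name__ == "__main__":
    # An empty list means all three modules are visible to this interpreter.
    print(missing_modules(["spacy", "spacy_transformers", "en_core_web_trf"]))
```

If the printed list is not empty, the packages were probably installed into a different environment than the one running the code, which is a common cause of this kind of loading error with conda.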
Problem converting data from CoNLL format to spaCy format
How can I convert data from CoNLL format to spaCy format? I ran the code below, following a similar Q&A on Stack Overflow: How to convert from CoNLL format to spacy format. CoNLL spacyformat However, I cannot fix the error. Code Error Message I’ve read the documentation for spacy convert, but have no idea how to fix the error. Environment Python 3.9.1 spaCy version