I have a TMX file containing source and target segments. Some of these segments are made up of several sentences. My goal is to segment these multi-sentence segments so that the entire TMX file consists of single-sentence segments. I intend to use spacy’s dependency parser to segment these multi-sentenc…
Tag: spacy
Get, for each word, the number of sentences in which it appears in a given text [closed]
I’m using Spacy and I am looking for a program that counts the frequencies of each word…
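A minimal sketch of the counting step the title describes, assuming the text has already been split into sentences and tokens (for example by iterating over spaCy's `doc.sents`). The function name `sentence_frequencies`, the lowercasing, and the sample data are illustrative assumptions, not part of the original question.

```python
from collections import Counter


def sentence_frequencies(sentences):
    """Map each word to the number of sentences that contain it.

    `sentences` is an iterable of token lists (hypothetical input shape;
    with spaCy it could be built from `doc.sents`).
    """
    counts = Counter()
    for tokens in sentences:
        # A set makes a word repeated inside one sentence count only once.
        counts.update({token.lower() for token in tokens})
    return counts


# Hypothetical sample data: three already-tokenized sentences.
sentences = [
    ["The", "cat", "sat"],
    ["The", "dog", "sat", "and", "sat"],
    ["A", "bird", "flew"],
]
freq = sentence_frequencies(sentences)
```

Lowercasing treats "The" and "the" as the same word; drop it if case matters for the count.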
Given a word can we get all possible lemmas for it using Spacy?
The input word is standalone and not part of a sentence but I would like to get all of its possible lemmas as if the input word were in different sentences with all possible POS tags. I would also like to get the lookup version of the word’s lemma. Why am I doing this? I have extracted lemmas from all
Named Entity Recognition (NER) for multiple languages
I am writing some code to perform Named Entity Recognition (NER), which is coming along quite nicely for English texts. However, I would like to be able to apply NER to any language. To do this, I would like to 1) identify the language of a text, and then 2) apply the NER for the identified language. For step…
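One way to sketch the second step (choosing an NER pipeline once the language is known) is a plain lookup table from language code to spaCy model package, with a multi-language fallback. The language code is assumed to come from whatever detector handles step 1. The table contents, the helper name `pick_model`, and the choice of `xx_ent_wiki_sm` as the fallback are assumptions for illustration, and each package would still need to be installed before `spacy.load` could use it.

```python
# Hypothetical mapping from ISO 639-1 codes to spaCy model package names.
MODEL_BY_LANG = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
    "es": "es_core_news_sm",
}

# Assumed fallback: spaCy's multi-language NER package.
FALLBACK_MODEL = "xx_ent_wiki_sm"


def pick_model(lang_code):
    """Return the model package name to load for a detected language code."""
    return MODEL_BY_LANG.get(lang_code.lower(), FALLBACK_MODEL)


german_model = pick_model("de")
unknown_model = pick_model("sw")  # not in the table, so the fallback is used
```

The returned name would then be passed to `spacy.load(...)`; caching the loaded pipelines per language avoids reloading a model for every text.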
SpaCy NLP - Detect the verb form
As far as I know, we can get the v1 form of a verb using … I wanted to know: is there a way in which we can get the form of the verb? For example, for swims it should output v4. Is there a way to do that using SpaCy or any other lib, and if there is then please give a
Meaningless Spacy Nouns
I am using Spacy for extracting nouns from sentences. These sentences are grammatically poor and may contain some spelling mistakes as well. Here is the code that I am using: Code Output: Similarly for sentence “fast foward2”, I get Spacy noun as Which shows that these nouns have some meaningless …
How to use LanguageDetector() from spacy_langdetect package?
I’m trying to use the spacy_langdetect package and the only example code I can find is (https://spacy.io/universe/project/spacy-langdetect): It’s throwing error: nlp.add_pipe now takes the string name of the registered component factory, not a callable component. So I tried using the below for add…
Can’t find SpaCy model when packaging with PyInstaller
I am using PyInstaller to package a Python script into an .exe. This script is using spacy to load up the following model: en_core_web_sm. I have already run python -m spacy download en_core_web_sm to download the model locally. The issue is when PyInstaller tries to package up my script it can’t find the …
Warning: [W108] The rule-based lemmatizer did not find POS annotation for the token ‘This’
What is this message about? How do I remove this warning message? Warning: [W108] The rule-based lemmatizer did not find POS annotation for the token ‘This’. Check that your pipeline includes components that assign token.pos, typically ‘tagger’+’attribute_ruler’ or ‘m…
spacy matcher returns the right answer only when the two words are set as separate ‘TEXT’ conditional objects. Why is that?
I’m trying to set up a matcher that finds the phrase ‘iPhone X’. The sample code says I should follow the pattern below. I tried another approach, written like below. Why is the second approach not working? I assumed that if I put the two words ‘iPhone’ and ‘X’ together, it might work as the sam…
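The question's own code is not preserved here, so the following is only a sketch of the likely difference between the two approaches. In spaCy's `Matcher`, each dict in a pattern describes exactly one token, and the English tokenizer splits "iPhone X" into two tokens, so a single `TEXT` condition holding both words has nothing to match. The example sentence, the rule name `IPHONE_X`, and the helper `find_matches` are assumptions; a blank English pipeline is used so no trained model is needed.

```python
import spacy
from spacy.matcher import Matcher

# Blank pipeline: tokenizer only, no model download required.
nlp = spacy.blank("en")
doc = nlp("I just bought an iPhone X yesterday.")


def find_matches(pattern):
    """Run one token pattern over the doc and return the matched texts."""
    matcher = Matcher(nlp.vocab)
    matcher.add("IPHONE_X", [pattern])  # hypothetical rule name
    return [doc[start:end].text for _, start, end in matcher(doc)]


# One dict per token: matches the two consecutive tokens "iPhone" and "X".
two_tokens = find_matches([{"TEXT": "iPhone"}, {"TEXT": "X"}])

# One dict for both words: no single token has the text "iPhone X".
one_token = find_matches([{"TEXT": "iPhone X"}])
```

For matching whole multi-word strings without writing per-token patterns, spaCy's `PhraseMatcher` may be a better fit.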