Tag: nlp

Position of that Noun and Verb

I have rule-based code that prints each noun that is followed by a verb in a sentence. The output for a sentence following this rule: high school football players charged after video surfaces showing hazing trump accuser pushes new york to pass the adult survivors act plans to sue Is there a way to also print out the
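A minimal sketch of reporting positions alongside the matches, assuming the tokens have already been POS-tagged. The hand-tagged list and the `noun_verb_pairs` helper are hypothetical and are not the asker's code; a real pipeline would get the tags from a tagger.

```python
# Hypothetical, hand-tagged tokens: (word, coarse POS tag).
tagged = [
    ("trump", "PROPN"),
    ("accuser", "NOUN"),
    ("pushes", "VERB"),
    ("new", "PROPN"),
    ("york", "PROPN"),
]


def noun_verb_pairs(tagged_tokens):
    """Return (noun_index, noun, verb_index, verb) for each noun directly followed by a verb."""
    pairs = []
    for i in range(len(tagged_tokens) - 1):
        word, tag = tagged_tokens[i]
        next_word, next_tag = tagged_tokens[i + 1]
        if tag == "NOUN" and next_tag == "VERB":
            pairs.append((i, word, i + 1, next_word))
    return pairs


for noun_idx, noun, verb_idx, verb in noun_verb_pairs(tagged):
    print(f"{noun} (token {noun_idx}) -> {verb} (token {verb_idx})")
```

Indices here are 0-based token positions; character offsets would need the tokenizer's own offset information.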

A* search algorithm implementation in python

algorithm nlp python

I am trying to build a very simple A* search algorithm in Python 3. Given the following distances for each node (where S is the start node and G is the goal node), I want to write a function that finds the best path based on total cost (i.e., f(n) for those familiar with the terminology) for the following search space:
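A minimal sketch of A* using a priority queue ordered by f(n) = g(n) + h(n). The graph, edge costs, and heuristic values below are hypothetical stand-ins, since the question's search space is not shown here.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (path, cost) for the cheapest path found, or (None, inf) if none exists."""
    # Each frontier entry is (f, g, node, path); heapq pops the smallest f first.
    frontier = [(heuristic[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was already found
        for neighbor, cost in graph.get(node, {}).items():
            new_g = g + cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                new_f = new_g + heuristic[neighbor]
                heapq.heappush(frontier, (new_f, new_g, neighbor, path + [neighbor]))
    return None, float("inf")


# Hypothetical search space: edge costs and straight-line estimates to G.
graph = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 3}, "G": {}}
heuristic = {"S": 5, "A": 4, "B": 3, "G": 0}

path, cost = a_star(graph, heuristic, "S", "G")
print(path, cost)
```

The result is only guaranteed optimal when the heuristic never overestimates the remaining cost, which holds for the made-up values above.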

Create dictionary of context words without stopwords

dictionary nlp python string synonym

I am trying to create a dictionary of words in a text and their context. The context should be the list of words that occur within a 5-word window (two words on either side) of the term’s position in the string. Effectively, I want to ignore the stopwords in my output vectors. My code is below. I can get
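A minimal sketch of such a context dictionary, under two assumptions that the excerpt does not settle: the window is measured on the original token positions, and stopwords are dropped both as keys and as context entries. The tiny stopword set and the function name are hypothetical.

```python
STOPWORDS = {"the", "over"}  # hypothetical; a real list would be much longer


def context_dictionary(text, window=2, stopwords=STOPWORDS):
    """Map each non-stopword to the non-stopwords within `window` tokens on either side."""
    tokens = text.lower().split()
    contexts = {}
    for i, term in enumerate(tokens):
        if term in stopwords:
            continue
        neighbors = contexts.setdefault(term, [])
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i and tokens[j] not in stopwords:
                neighbors.append(tokens[j])
    return contexts


result = context_dictionary("the quick brown fox jumps over the lazy dog")
print(result["fox"])
```

If stopwords should instead be removed before the window is measured, filtering `tokens` first and then applying the same loop would widen the effective context.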

Training, Validation and Test sets for imbalanced datasets in Machine Learning

classification machine-learning nlp python scikit-learn

I am working on an NLP classification task. My dataset is imbalanced: some authors have only one text, and I want those texts to appear only in the training set. For the other authors, I need to split the dataset into 70% training, 15% validation, and 15% test sets. I tried to
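A minimal sketch of the split described above, using only the standard library. The `(author, text)` tuple layout and the `split_by_author` name are assumptions, and the 70/15/15 split of the remaining texts is a plain shuffled split, not a stratified one.

```python
import random
from collections import Counter


def split_by_author(samples, seed=0):
    """samples: list of (author, text). Single-text authors go to train only."""
    counts = Counter(author for author, _ in samples)
    train = [s for s in samples if counts[s[0]] == 1]
    rest = [s for s in samples if counts[s[0]] > 1]

    # Shuffle reproducibly, then cut at 70% and 85% of the remaining samples.
    random.Random(seed).shuffle(rest)
    n_train = round(0.70 * len(rest))
    n_val = round(0.15 * len(rest))

    train += rest[:n_train]
    val = rest[n_train:n_train + n_val]
    test = rest[n_train + n_val:]
    return train, val, test


# Hypothetical data: two prolific authors and two authors with one text each.
samples = [("a", f"text a{i}") for i in range(20)]
samples += [("b", f"text b{i}") for i in range(20)]
samples += [("c", "only text by c"), ("d", "only text by d")]

train, val, test = split_by_author(samples)
print(len(train), len(val), len(test))
```

Keeping each multi-text author represented in all three sets would need a per-author (stratified) split on top of this.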

spacy Entity Ruler pattern isn’t working for ent_type

nlp python spacy spacy-3

I am trying to get the entity ruler patterns to use a combination of lemma and ent_type to generate a tag for the phrase “landed (or land) in Baltimore (location)”. It seems to work with the Matcher, but not with the entity ruler I created. I set the override ents to True, so I am not really sure why this isn’t working. It

Python – How to loop through each index position in a list?

for-loop linguistics nlp python

Given a list [[["source1"], ["target1"], ["alignment1"]], [["source2"], ["target2"], ["alignment2"]], …], I want to extract the words in the source that align with the words in the target. For example, in the English-German sentence pair "The hat is on the table ." – "Der Hut liegt auf dem Tisch .", I want to print the following: So I have written
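A minimal sketch of walking such a list, assuming each alignment is a string of space-separated `i-j` index pairs (source index, target index). That format is an assumption, since the excerpt does not show the alignment data, and `aligned_pairs` is a hypothetical helper.

```python
def aligned_pairs(corpus):
    """Yield (source_word, target_word) for every alignment link in every sentence pair."""
    # Each item holds three single-element lists: source, target, alignment.
    for (source,), (target,), (alignment,) in corpus:
        source_words = source.split()
        target_words = target.split()
        for link in alignment.split():
            i, j = map(int, link.split("-"))
            yield source_words[i], target_words[j]


# Hypothetical alignment string for the sentence pair from the question.
corpus = [
    [
        ["The hat is on the table ."],
        ["Der Hut liegt auf dem Tisch ."],
        ["0-0 1-1 2-2 3-3 4-4 5-5 6-6"],
    ],
]

for source_word, target_word in aligned_pairs(corpus):
    print(source_word, "-", target_word)
```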

How to correctly pass a split function to TextVectorization layer

keras nlp python tensorflow

I’m defining a custom split callable for TextVectorization like this: resulting in: As seen above, the split function works correctly outside of the TextVectorization layer but fails when passed as a callable. Answer: Your split_slash function does not seem to properly tokenize the phrases. It is probably because your TextVectorization layer strips your phrases of all punctuation, including /
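The effect described in the answer can be imitated in plain Python. This is not Keras code: `strip_punctuation` below is a hypothetical stand-in for the layer's default standardization step, which runs before the split callable is applied.

```python
import string


def strip_punctuation(text):
    """Stand-in for default standardization: lowercase and drop ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


def split_slash(text):
    """Split a phrase on '/' (same idea as the question's split callable)."""
    return text.split("/")


phrase = "red/green/blue"

# Splitting the raw phrase works as expected.
print(split_slash(phrase))

# Standardizing first removes every '/', so there is nothing left to split on.
print(split_slash(strip_punctuation(phrase)))
```

If this is indeed the cause, one thing to try is setting the layer's `standardize` argument to `None`, or to a custom callable that keeps `/`, so the separator survives until the split step.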

BERT get sentence embedding

bert-language-model huggingface-tokenizers huggingface-transformers nlp python

I am replicating code from this page. I have downloaded the BERT model to my local system and am getting sentence embeddings from it. I have around 500,000 sentences for which I need embeddings, and it is taking a lot of time. Is there a way to expedite the process? Would sending batches of sentences rather than one sentence at a time

How to combine string from one column to another column at same index in pandas DataFrame?

dataframe nlp pandas python

I was doing a project in NLP. My input is: I need output like this: How can I achieve this? Answer: You can use groupby+transform('max') to replace the empty cells with the letter per group, as the letters have precedence over the space. The rest is a simple string concatenation per column: Used input: NB. I considered “index” to be a