Tag: nlp

What does this “.children” attribute do?

I’m trying to understand how a key-bigram extractor works, and I cannot understand what the following block of code does. Here is the source code. Everything else is working fine and I understand it well; however, I cannot understand what child for child in possible_words.children does. Answer token…

Gensim Word2Vec exhausting iterable

I’m getting the following prompt when calling model.train() from gensim word2vec. The only solutions I found while searching for an answer point to the iterable vs. iterator difference. At this point I have tried everything I could to solve this on my own; currently, my code looks like this: The corpus varia…
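
The iterable vs. iterator difference mentioned in this excerpt can be illustrated without gensim itself. The following is only a minimal sketch under assumptions: `SentenceCorpus` and `sentences` are hypothetical names, not taken from the question's code, and the corpus is a small in-memory list. A generator is exhausted after one pass, whereas an object whose `__iter__` starts a fresh pass each time can be traversed repeatedly, which is what multi-pass training needs.

```python
class SentenceCorpus:
    """Hypothetical restartable iterable: each __iter__ call starts a new pass."""

    def __init__(self, sentences):
        self._sentences = sentences

    def __iter__(self):
        # A new generator is created on every call, so the data can be re-read.
        for sentence in self._sentences:
            yield sentence.lower().split()


# Assumed toy data, for illustration only.
sentences = ["The first sentence", "A second sentence"]

# A generator expression is an iterator: it can be consumed only once.
one_shot = (s.lower().split() for s in sentences)
first_pass = list(one_shot)
second_pass = list(one_shot)  # empty, the generator is already exhausted

# The class-based corpus yields the same tokens on every pass.
corpus = SentenceCorpus(sentences)
pass_a = list(corpus)
pass_b = list(corpus)
```

Passing something like `corpus` rather than `one_shot` to a training routine that iterates over the data more than once avoids the exhaustion problem; whether this matches the asker's setup cannot be confirmed from the truncated excerpt.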

Regex: searching for words that start with @ or @

I want to create a regex in Python that finds words that start with @ or @. I have created the following regex, but the output contains one extra space in each string, as you can see. However, the output that I want is the following. I would be grateful if you could help me! Edit: @The fourth
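
The asker's regex is not shown in this excerpt, so the following is only one possible approach, with an assumed sample string. A common cause of an extra space in each match is a pattern that consumes the whitespace before the word; a lookbehind checks for that whitespace without including it in the match.

```python
import re

# Assumed sample text, for illustration only.
text = "@The fourth @word is here, not an e@mail"

# (?<!\S) asserts that the @ is at the start of the string or preceded by
# whitespace, without making that whitespace part of the match.
pattern = re.compile(r"(?<!\S)@\w+")

mentions = pattern.findall(text)
print(mentions)
```

Because the lookbehind is zero-width, the returned strings carry no leading space, and an @ in the middle of a token such as "e@mail" is not matched.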

How to handle numbers embedded in text during NLP pre-processing?

I am trying to run the LDA algorithm on a data set of news articles. I understand that numbers must be removed during the pre-processing step, and I have written a simple regex to replace numbers with blanks. However, I would like to retain some numbers since removing them can potentially change the cont…
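
Which numbers are worth keeping depends on the corpus, and the excerpt does not say which ones the asker has in mind. As a sketch of one possible rule (an assumption, not the asker's method), the code below drops standalone numeric tokens but keeps digits that are part of a larger token such as "5G". The function name `strip_standalone_numbers` and the sample sentence are hypothetical.

```python
import re

# Matches a number that forms a whole token, optionally with "." or ","
# groups (e.g. 2020, 3.5, 1,000). Digits glued to letters do not match
# because there is no word boundary between a letter and a digit.
STANDALONE_NUMBER = re.compile(r"\b\d+(?:[.,]\d+)*\b")


def strip_standalone_numbers(text):
    """Remove standalone numbers, then collapse the leftover whitespace."""
    without_numbers = STANDALONE_NUMBER.sub("", text)
    return re.sub(r"\s+", " ", without_numbers).strip()


# Assumed sample sentence, for illustration only.
cleaned = strip_standalone_numbers("In 2020 the 5G rollout grew 3.5 percent")
print(cleaned)
```

A stricter or looser rule, for example a whitelist of specific numbers to retain, could be swapped in by changing the pattern or by filtering matches in a replacement function.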