I’m getting the following prompt when calling model.train() from gensim word2vec. The only solutions I found while searching for an answer point to the iterable vs. iterator difference, and at this point I have tried everything I could to solve this on my own. Currently, my code looks like this: The corpus varia…
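The asker's code is not shown in this excerpt, so the following is only a minimal sketch of the iterable vs. iterator distinction it mentions. The class name RestartableCorpus and the sample texts are hypothetical. The point is that an object with its own __iter__ can be walked several times (which multi-pass training needs), while a generator is exhausted after one pass.

```python
class RestartableCorpus:
    """Iterable that yields one token list per text and can be iterated repeatedly."""

    def __init__(self, texts):
        self.texts = texts

    def __iter__(self):
        # A fresh generator is created on every call to iter(),
        # so each pass over the corpus starts from the beginning.
        for text in self.texts:
            yield text.lower().split()


# Hypothetical sample data, for illustration only.
texts = ["I am studying word2vec", "gensim trains word vectors"]

# Iterable: two passes return the same sentences.
corpus = RestartableCorpus(texts)
first_pass = list(corpus)
second_pass = list(corpus)

# Iterator (generator expression): the second pass finds nothing left.
one_shot = (text.lower().split() for text in texts)
gen_first = list(one_shot)
gen_second = list(one_shot)
```

Here first_pass and second_pass are identical, while gen_second is empty because the generator was consumed by the first pass.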
Tag: gensim
Retrieve n-grams with word2vec
I have a list of texts. I turn each text into a token list. For example, if one of the texts is ‘I am studying word2vec’, the respective token list will be (assuming I consider n-grams with n = 1, 2, 3) [‘I’, ‘am’, ‘studying’, ‘word2vec’, ‘I am’,…
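A token list of this shape could be built as in the sketch below. The function name ngram_tokens and the choice to join the words of an n-gram with a single space are assumptions, since the asker's own code is not shown.

```python
def ngram_tokens(text, max_n=3):
    """Return all word n-grams of the text for n = 1..max_n, shortest first."""
    words = text.split()
    tokens = []
    for n in range(1, max_n + 1):
        # Slide a window of n words across the text.
        for start in range(len(words) - n + 1):
            tokens.append(" ".join(words[start:start + n]))
    return tokens


tokens = ngram_tokens("I am studying word2vec")
```

For the four-word example this gives four unigrams, three bigrams, and two trigrams, in that order.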
disable logging for specific lines of code
I am tuning the word2vec model hyper-parameters. Word2Vec writes so many log messages to the console that I cannot read the Optuna logs or my own custom logs. Is there any trick to suppress the logs generated by Word2Vec? Answer: I used the following code in Python 3.7; in Python 3.6 we have to pass logging.ERROR to the disable function.
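The answer's code is missing from this excerpt. One way to silence logging for specific lines, sketched here with the standard library only, is a small context manager around logging.disable. The names quiet_logs, ListHandler, and the logger "demo.word2vec" are hypothetical stand-ins. In real use the training call would go inside the with block.

```python
import logging
from contextlib import contextmanager


@contextmanager
def quiet_logs(level=logging.ERROR):
    """Suppress all log records at `level` and below inside the with block."""
    logging.disable(level)
    try:
        yield
    finally:
        # NOTSET removes the process-wide override again.
        logging.disable(logging.NOTSET)


class ListHandler(logging.Handler):
    """Collects messages in a list so the effect can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Hypothetical logger standing in for a noisy library.
handler = ListHandler()
demo_logger = logging.getLogger("demo.word2vec")
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)

with quiet_logs():
    demo_logger.info("epoch progress")  # suppressed

demo_logger.info("tuning finished")  # recorded
```

Note that logging.disable is process-wide, so custom logs at ERROR or below are silenced too while the block runs. Raising the level of the library's own logger instead would leave other loggers untouched.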
Modifying .trainables.syn1neg[i] with previously trained vectors in Gensim word2vec
My issue is the following. In my code I’m modifying .wv[word] before training but after .build_vocab(), which is fairly straightforward: instead of the vectors already in there, I put in mine for every word. Where setIntersection is just a set of common words between gensim word2vec and RandomIndexing train…
training a Fasttext model
I want to train a Fasttext model in Python using the “gensim” library. First, I should tokenize each sentence into its words, converting each sentence to a list of words. Then, this list should be appended to a final list. Therefore, at the end, I will have a nested list containing all tokeni…
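The nested list described here could be built as in the following sketch. The regex tokenizer and the sample sentences are assumptions made for illustration. The asker may well use a different tokenizer.

```python
import re


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


# Hypothetical sample sentences.
raw_sentences = [
    "I want to train a model.",
    "Each sentence becomes a list of words.",
]

# One inner list of tokens per sentence.
tokenized = [tokenize(sentence) for sentence in raw_sentences]
```

A list of token lists like tokenized is the usual shape of input passed to gensim model classes as their sentences argument.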
Gensim LDA Coherence Score NaN
I created a Gensim LDA Model as shown in this tutorial: https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/ And it generates 10 topics with a log_perplexity of: lda_model.log_perplexity(data_df[‘bow_corpus’]) = -5.325966117835991 But when I run the coherence model on it to calcul…
Doc2Vec find the similar sentence
I am trying to find similar sentences using doc2vec. What I am not able to find is the actual sentence that matches among the trained sentences. Below is the code from this article: But the above code only gives me vectors or numbers. How can I get the actual sentence matched from the training data? For example …
CalledProcessError: Returned non-zero exit status 1
When I try to run: I get the following error: What can I do in my code specifically to make it work? Furthermore, the question about this error has been asked a few times before. However, each answer seems so specific to a particular case that I don’t see what I can change in my code now so that it
Measure similarity between two documents using Doc2Vec
I have already trained a gensim Doc2Vec model, which finds the most similar documents to an unknown one. Now I need to find the similarity value between two unknown documents (which were not in the training data, so they cannot be referenced by doc id). In the code above, vec1 and vec2 are successfully initiali…
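Once two document vectors exist, a common similarity value is their cosine similarity. The sketch below assumes vec1 and vec2 are plain sequences of floats. The numbers are made up, and in the asker's setting they would presumably come from the trained model, for example through Doc2Vec's infer_vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for two inferred document vectors.
vec1 = [1.0, 2.0, 3.0]
vec2 = [2.0, 4.0, 6.0]
score = cosine_similarity(vec1, vec2)
```

Because inference in Doc2Vec is randomized, repeated inference on the same text can give slightly different vectors, so the resulting score may vary a little between runs.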