Tag: word2vec

Word2Vec + LSTM Good Training and Validation but Poor on Test

I am currently training a Word2Vec + LSTM model for Twitter sentiment analysis. I use the pre-trained GoogleNewsVectorNegative300 word embeddings because performance was much worse when I trained my own Word2Vec model on my own dataset. My question is why validation accuracy and loss got stuck at 0.88 and 0.34, respectively, during training. Then, my confusion

Retrieve n-grams with word2vec

I have a list of texts. I turn each text into a token list. For example, if one of the texts is ‘I am studying word2vec’, the respective token list will be (assuming I consider n-grams with n = 1, 2, 3) [‘I’, ‘am’, ‘studying’, ‘word2vec’, ‘I am’, ‘am studying’, ‘studying word2vec’, ‘I am studying’, ‘am studying word2vec’]. Is
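
The token list described in the excerpt above can be reproduced with a short sketch. It assumes whitespace tokenization; the function name `ngram_tokens` and its `max_n` parameter are made up for illustration and do not come from the original question.

```python
def ngram_tokens(text, max_n=3):
    """Return all word n-grams of `text` for n = 1..max_n, shortest first.

    Assumption: words are separated by whitespace; no lowercasing or
    punctuation handling is applied.
    """
    words = text.split()
    tokens = []
    for n in range(1, max_n + 1):
        # Slide a window of n words across the sentence.
        for start in range(len(words) - n + 1):
            tokens.append(" ".join(words[start:start + n]))
    return tokens


tokens = ngram_tokens("I am studying word2vec")
print(tokens)
```

For the example sentence this gives the four unigrams, then the three bigrams, then the two trigrams, in the same order as the list in the question.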