Tag: nlp

KeyError on a certain word

naivebayes nlp non-english python text-classification

I am trying to use Naive Bayes for spam-ham classification. I keep getting an error on a word here: The error message is just this: ‘hafta’ is the first word of the pandas dataframe and the training dataset. I tried the solution from an issue that seemed similar to mine, but it didn’t wo…

Create a list of lists of tuples from reading a txt file

file nlp python token

I have a txt file that looks like this: And I’m trying to make tuples from this txt, which I will later convert from words to features. I want to have a list of lists that looks like this: All of the whitespace indicates that the sentence is over and should be added to the list at the given index, later after
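The file layout is not visible in this excerpt, so the following is only a minimal sketch under assumptions: each non-blank line holds whitespace-separated fields (for example a token and a label), and a blank line marks the end of a sentence. The function name `read_sentences` and the sample data are hypothetical.

```python
def read_sentences(lines):
    """Group lines into sentences; each sentence is a list of tuples.

    Assumption: non-blank lines contain whitespace-separated fields,
    and a blank line closes the current sentence.
    """
    sentences, current = [], []
    for line in lines:
        fields = line.split()
        if fields:
            current.append(tuple(fields))
        elif current:
            # Blank line: the current sentence is complete.
            sentences.append(current)
            current = []
    if current:
        # The file may not end with a blank line.
        sentences.append(current)
    return sentences


# Hypothetical sample standing in for the txt file's contents.
sample = "I PRON\nrun VERB\n\nDogs NOUN\nbark VERB\n"
result = read_sentences(sample.splitlines())
```

An open file object can be passed in place of `sample.splitlines()`, since iterating over a file also yields one line at a time.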

Count the number of times a group of words appears in a text

count nlp python text word-count

I have 4 lists of words that categorise something and a tokenised text by word. I would like to count the number of occurrences of the words in these lists in a certain text but as a sum of the words for each list. Therefore the results would show an occurrence of 10 animal words, 20 colour words, 6 food
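One way this kind of per-list total could be computed is sketched below, assuming the text is already a list of tokens and matching is case-insensitive. The category names, word lists, and the function name `count_by_category` are made up for illustration.

```python
from collections import Counter


def count_by_category(tokens, categories):
    """Return {category: total occurrences of its words in tokens}."""
    token_counts = Counter(token.lower() for token in tokens)
    return {
        # Deduplicate each word list so a repeated entry is not counted twice.
        name: sum(token_counts[word] for word in {w.lower() for w in words})
        for name, words in categories.items()
    }


# Hypothetical word lists and tokenised text.
categories = {
    "animal": ["cat", "dog"],
    "colour": ["red", "blue"],
    "food": ["bread"],
}
tokens = "the red dog chased the blue cat and another dog".split()
totals = count_by_category(tokens, categories)
```

Counting the tokens once with `Counter` and then summing per list avoids rescanning the text for every word.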

Keyword extraction in Python – How to handle hyphenated compound words

nlp python

I’m trying to perform keyphrase extraction with Python, using KeyBERT and pke PositionRank. You can see an extract of my code below. and here the results: I would like hyphenated compound words (such as life-cycle in the example) to be treated as a single word, but I cannot understand how to exclu…
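The code from the question is not shown in this excerpt, so the following only illustrates the general idea with the standard library: a token pattern that keeps hyphen-joined parts together. The pattern and the `tokenize` helper are assumptions for illustration, not part of KeyBERT or pke.

```python
import re

# One or more word characters, optionally followed by "-word" groups,
# so "life-cycle" is matched as a single token.
HYPHEN_AWARE = re.compile(r"\b\w+(?:-\w+)*\b")


def tokenize(text):
    """Lowercase the text and split it into hyphen-aware tokens."""
    return HYPHEN_AWARE.findall(text.lower())


tokens = tokenize("Life-cycle assessment covers the full life-cycle of a product.")
```

If the extraction pipeline tokenises through scikit-learn's `CountVectorizer`, a pattern of this kind could be tried as its `token_pattern` argument; whether that applies to the code in the question cannot be told from the excerpt.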

Counting word frequency in a sentence

count nlp pandas python string

I have two columns – one with sentences and the other with single words.
Sentence | word
“Such a day! It’s a beautiful day out there” | “beautiful”
“Such a day! It’s a beautiful day out there” | “day”
“I am sad by the sad weather” | “…
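A minimal sketch of the per-row count, assuming whole-word, case-insensitive matching is what is wanted. The helper name `count_word` is hypothetical; the two rows are taken from the excerpt above.

```python
import re


def count_word(sentence, word):
    """Count whole-word, case-insensitive occurrences of word in sentence."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, sentence, flags=re.IGNORECASE))


rows = [
    ("Such a day! It's a beautiful day out there", "beautiful"),
    ("Such a day! It's a beautiful day out there", "day"),
]
counts = [count_word(sentence, word) for sentence, word in rows]
```

With a pandas DataFrame, the same function could be applied row by row, for example `df.apply(lambda r: count_word(r["Sentence"], r["word"]), axis=1)`, assuming the columns are named `Sentence` and `word`.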

Is there a way to find the antonym (word with the opposite meaning) of a word with Python? Do you know a dataset or an NLP toolkit?

dataset nlp python

Thank you for your help! Answer: NLTK is a widely used library for NLP and it includes many corpora. See the code here: How to generate a list of antonyms for adjectives in WordNet using Python. NLTK documentation on using WordNet: https://www.nltk.org/howto/wordnet.html

Word2Vec + LSTM Good Training and Validation but Poor on Test

keras lstm nlp python word2vec

Currently I’m training my Word2Vec + LSTM model for Twitter sentiment analysis. I use the pre-trained GoogleNewsVectorNegative300 word embedding. I chose the pre-trained GoogleNewsVectorNegative300 because the performance was much worse when I trained my own Word2Vec on my own dataset. The problem is w…

Job type (Full Time, Part Time) detection with a machine learning model in Python

machine-learning nlp python python-3.x

I have a dataset of jobs with the columns “Title”, “Description”, “City”, etc., and a “Best Jobs” column. The output of the dataset is “Best Jobs”, which has two values (Yes, No): Yes means the job is part time and No means the job is full time. I want to t…

How can I apply an operation to a list but keep an element unchanged when the result comes out null?

function isinstance map-function nlp python

I have a list of lists of word groups in Turkish. I want to apply stemming and I found the turkishnlp package. Although it has some shortcomings, it often returns the right word. However, when I apply this to the list, I don’t want the structure of my list to change and I want the words that it doesn’t…
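The pattern described here (apply a function to every word, keep the nesting, fall back to the original word when nothing comes back) can be sketched without reference to any particular stemmer. Below, `toy_stem` is a hypothetical stand-in that returns `None` for unknown words; it does not reflect the turkishnlp API, which is not shown in the excerpt.

```python
def stem_or_keep(word, stem):
    """Return stem(word), or the original word if the result is empty or None."""
    result = stem(word)
    return result if result else word


def stem_nested(groups, stem):
    """Apply stem_or_keep to every word while preserving the list-of-lists shape."""
    return [[stem_or_keep(word, stem) for word in group] for group in groups]


def toy_stem(word):
    # Hypothetical stemmer: a lookup table that returns None for unknown words.
    table = {"kitaplar": "kitap", "evler": "ev"}
    return table.get(word)


groups = [["kitaplar", "evler"], ["xyz"]]
stemmed = stem_nested(groups, toy_stem)
```

Passing the stemmer in as an argument keeps the fallback logic separate from whichever library actually does the stemming.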

How to label multi-word entities?

named-entity-recognition nlp pandas python training-data

I’m quite new to data analysis (and Python in general), and I’m currently a bit stuck in my project. For my NLP-task I need to create training data, i.e. find specific entities in sentences and label them. I have multiple csv files containing the entities I am trying to find, many of them consisti…