I have a bunch of sentences in a list and want to stem them with the nltk library. I can stem one sentence at a time, but I'm having trouble stemming every sentence in the list and joining the stems back together. Is there a step I'm missing? I'm quite new to nltk. Thanks!
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer

ps = PorterStemmer()

# Success: one sentence at a time
data = 'the gamers playing games'
words = word_tokenize(data)
for w in words:
    print(ps.stem(w))

# Fails: a list is passed where a string is expected
data_list = ['the gamers playing games', 'higher scores', 'sports']
words = word_tokenize(data_list)
for w in words:
    print(ps.stem(w))

# Error: TypeError: expected string or bytes-like object
# result should be: ['the gamer play game', 'higher score', 'sport']
Answer
You're passing a list to word_tokenize, which expects a single string; that is what raises the TypeError.
The solution is to wrap your logic in another for-loop that goes through the list one sentence at a time:
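from nltk import tokenize

# ps is the PorterStemmer instance created in the question
data_list = ['the gamers playing games', 'higher scores', 'sports']
for sentence in data_list:
    words = tokenize.word_tokenize(sentence)  # tokenize one sentence at a time
    for w in words:
        print(ps.stem(w))

# prints, one stem per line: the gamer play game higher score sport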
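The loop above prints every stem on its own, so it does not yet give the list of re-joined sentences the question asks for. One way to get that result is sketched below. The helper name `stem_sentences` and its `tokenize` parameter are made up for illustration, and the default tokenizer is plain whitespace splitting (`str.split`), assumed here so that the sketch runs without downloading NLTK's tokenizer data; it is not what the original code uses.

```python
from nltk.stem import PorterStemmer


def stem_sentences(sentences, tokenize=str.split):
    """Stem every word of every sentence and join each sentence back up.

    `tokenize` is any callable mapping a string to a list of tokens.
    The default (whitespace split) is an assumption of this sketch.
    """
    ps = PorterStemmer()
    return [
        " ".join(ps.stem(token) for token in tokenize(sentence))
        for sentence in sentences
    ]


data_list = ['the gamers playing games', 'higher scores', 'sports']
stemmed = stem_sentences(data_list)
print(stemmed)
```

If you have the tokenizer data installed, passing `tokenize=word_tokenize` (from `nltk.tokenize`) should give the same result for these sentences, and it will also handle punctuation, which whitespace splitting does not.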