I have a data frame like this:
text                 pos
No thank you.        [(No, DT), (thank, NN), (you, PRP)]
They didn't respond  [(They, PRP), (didn't, VBP), (respond, JJ)]
I want to apply a function to pos and save the result in a new column, so the output would look like this:
text                 pos                                          score
No thank you.        [(No, DT), (thank, NN), (you, PRP)]          [[0.0, 0.0, 1.0], [], [0.5, 0.0, 0.45]]
They didn't respond  [(They, PRP), (didn't, VBP), (respond, JJ)]  [[0.0, 0.0, 1.0], [], [0.75, 0.0, 0.25]]
So the function returns a list for each tuple in the list. The implementation of the function is not the point here, so I just call it get_sentiment.
I can do it with a nested loop, but I don't like that. I want to do it in a more Pythonic, pandas DataFrame way.
This is what I have tried so far:
df['score'] = df['pos'].apply(lambda k: [get_sentiment(x,y) for j in k for (x,y) in j])
However, it raises this error:
ValueError: too many values to unpack (expected 2)
There are a couple of similar questions on SO, but the answers were in R.
For more clarity: get_sentiment is a function in NLTK that assigns a list of scores to each word (the list is [positive score, negative score, objectivity score]). Overall, I need to apply that function to the pos column of my DataFrame.
Answer
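In your case each cell of pos is already a list of (word, tag) tuples, so the second loop iterates over the strings inside each tuple and tries to unpack them, which raises the ValueError. A single loop is enough:
df['score'] = df['pos'].apply(lambda k: [get_sentiment(j[0], j[1]) for j in k])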
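To see the difference end to end, here is a minimal sketch. The get_sentiment below is a hypothetical stand-in, not the real scoring function, and its return values are made up so the example can run on its own. The pos cells are assumed to hold real Python lists of (word, tag) tuples, as the question suggests.

```python
import pandas as pd


# Hypothetical stand-in for get_sentiment: returns a
# [positive, negative, objectivity] list, or [] for tags it does not score.
def get_sentiment(word, tag):
    if tag in ("DT", "PRP", "JJ"):
        return [0.0, 0.0, 1.0]
    return []


df = pd.DataFrame({
    "text": ["No thank you.", "They didn't respond"],
    "pos": [
        [("No", "DT"), ("thank", "NN"), ("you", "PRP")],
        [("They", "PRP"), ("didn't", "VBP"), ("respond", "JJ")],
    ],
})

# The nested version from the question: j is one (word, tag) tuple, so the
# inner loop walks over the two strings and tries to unpack each string
# into (x, y). That fails as soon as a string is not exactly 2 characters.
error_message = None
try:
    [get_sentiment(x, y) for j in df["pos"].iloc[0] for (x, y) in j]
except ValueError as err:
    error_message = str(err)

# One loop per cell is enough; the tuple can be unpacked in the for clause.
df["score"] = df["pos"].apply(
    lambda k: [get_sentiment(word, tag) for word, tag in k]
)

print(error_message)
print(df["score"].tolist())
```

Unpacking with `for word, tag in k` is equivalent to indexing with j[0] and j[1]; it only makes the intent a little easier to read.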