
Keywords extraction in Python – How to handle hyphenated compound words

I’m trying to perform keyphrase extraction in Python, using KeyBERT and pke’s PositionRank. You can see an extract of my code below.

JavaScript

and here the results:

JavaScript

I would like hyphenated compound words (such as life-cycle in the example) to be treated as a single word, but I cannot work out how to exclude the hyphen (-) from the list of word separators.

Thank you in advance for any help. Francesca


Answer

This could be a silly workaround, but it may help:

JavaScript

The output should look like this:

JavaScript
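
Another option, as a minimal sketch only: tokenize with a regular expression that treats a hyphen between word characters as part of the word. The pattern and the `tokenize` helper below are my own illustrative names, not part of KeyBERT or pke, and the sample sentence is made up.

```python
import re

# Illustrative pattern (an assumption, not taken from KeyBERT or pke):
# a run of word characters, optionally followed by "-<word characters>" groups.
HYPHEN_AWARE = re.compile(r"\w+(?:-\w+)*")


def tokenize(text):
    """Split text into lowercase tokens, keeping hyphenated compounds whole."""
    return HYPHEN_AWARE.findall(text.lower())


tokens = tokenize("The product life-cycle is a well-known concept.")
print(tokens)
# ['the', 'product', 'life-cycle', 'is', 'a', 'well-known', 'concept']
```

For KeyBERT, the same pattern string could probably be passed as `token_pattern` to a scikit-learn `CountVectorizer`, which `extract_keywords` accepts through its `vectorizer` argument. As far as I know, pke does its own preprocessing, so this would not carry over to PositionRank directly.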

I hope this helps :)

User contributions licensed under: CC BY-SA