I want to store the result of the code below in a DataFrame.
It should have two columns: one holding the original word and the other holding each of its synonyms, one synonym per row.
import nltk
from nltk.corpus import wordnet
import pandas as pd

List = ['protest','riot','conflict']
df=[]

def process_genre(str):
    for genre in str:
        result = []
        for syn in wordnet.synsets(genre):
            for l in syn.lemmas():
                result.append(l.name())
        print(set(result))

process_genre(List)
output:
-------
{'resist', 'objection', 'dissent', 'protestation', 'protest'}
{'bacchanalia', 'riot', 'saturnalia', 'belly_laugh', 'scream', 'wow', 'bacchanal', 'thigh-slapper', 'sidesplitter', 'drunken_revelry', 'carouse', 'rioting', 'roister', 'debauchery', 'orgy', 'public_violence', 'howler', 'debauch'}
{'fight', 'battle', 'difference', 'dispute', 'conflict', 'infringe', 'engagement', 'struggle', 'difference_of_opinion', 'contravene', 'run_afoul'}
I want to store the result in a DataFrame like this:
# Expected Result:
Col1        Col2
--------------------
protest     resist
protest     objection
protest     dissent
...         ...
riot        scream
riot        carouse
riot        saturnalia
...         ...
conflict    Fight
conflict    battle
...         ...
Answer
This is a possible solution:
from nltk.corpus import wordnet
import pandas as pd

def process_genres(genres):
    return (pd.DataFrame([(genre, l.name())
                          for genre in genres
                          for syn in wordnet.synsets(genre)
                          for l in syn.lemmas()], columns=['Col1', 'Col2'])
            .drop_duplicates())
Here’s how you can use it:
>>> genres = ['protest', 'riot', 'conflict']
>>> df = process_genres(genres)
>>> df
        Col1             Col2
0    protest          protest
1    protest     protestation
...
11      riot             riot
12      riot  public_violence
13      riot          rioting
...
34  conflict         conflict
35  conflict         struggle
36  conflict           battle
...
53  conflict       contravene
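
Note that the output above also pairs each word with itself (for example `protest protest`), whereas the expected result in the question seems to list only the other words. Here is a minimal sketch of a variant that can leave those rows out. The names `wordnet_lemma_names`, `synonym_rows`, `synonym_frame` and the `keep_self` flag are hypothetical, introduced only for this illustration. The imports of `nltk` and `pandas` are deferred into the functions that need them, so the row-building step can be tried with a stand-in lookup even without the WordNet corpus installed.

```python
from typing import Callable, Iterable, List, Tuple


def wordnet_lemma_names(word: str) -> List[str]:
    """Return every lemma name WordNet lists for `word` (duplicates included)."""
    # Imported lazily so the rest of this sketch runs without the WordNet corpus.
    from nltk.corpus import wordnet
    return [lemma.name()
            for syn in wordnet.synsets(word)
            for lemma in syn.lemmas()]


def synonym_rows(words: Iterable[str],
                 lookup: Callable[[str], Iterable[str]] = wordnet_lemma_names,
                 keep_self: bool = False) -> List[Tuple[str, str]]:
    """Build one (word, synonym) pair per distinct synonym, in first-seen order."""
    rows = []
    for word in words:
        seen = set()  # synonyms already emitted for this word
        for name in lookup(word):
            # Skip repeats, and skip the word itself unless keep_self is set.
            if name in seen or (not keep_self and name == word):
                continue
            seen.add(name)
            rows.append((word, name))
    return rows


def synonym_frame(words, lookup=wordnet_lemma_names, keep_self=False):
    """Wrap the pairs in a two-column DataFrame (column names as in the question)."""
    import pandas as pd
    return pd.DataFrame(synonym_rows(words, lookup, keep_self),
                        columns=['Col1', 'Col2'])
```

With the WordNet corpus available (it can be fetched once with `nltk.download('wordnet')`), `synonym_frame(['protest', 'riot', 'conflict'])` should give a frame shaped like the expected result, and passing `keep_self=True` keeps the self-pairs as in the answer above.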