I want to store the result of the code below in a data frame.
It should have two columns: one with the original word, and one with each of its synonyms on a separate row.
    import nltk
    from nltk.corpus import wordnet
    import pandas as pd

    List = ['protest','riot','conflict']
    df=[]

    def process_genre(str):
        for genre in str:
            result = []
            for syn in wordnet.synsets(genre):
                for l in syn.lemmas():
                    result.append(l.name())
            print(set(result))

    process_genre(List)

    output:
    -------
    {'resist', 'objection', 'dissent', 'protestation', 'protest'}
    {'bacchanalia', 'riot', 'saturnalia', 'belly_laugh', 'scream', 'wow', 'bacchanal', 'thigh-slapper', 'sidesplitter', 'drunken_revelry', 'carouse', 'rioting', 'roister', 'debauchery', 'orgy', 'public_violence', 'howler', 'debauch'}
    {'fight', 'battle', 'difference', 'dispute', 'conflict', 'infringe', 'engagement', 'struggle', 'difference_of_opinion', 'contravene', 'run_afoul'}
I want to store the result in a data frame like this:
    # Expected Result:

    Col1        Col2
    --------------------
    protest     resist
    protest     objection
    protest     dissent
    ...         ...
    riot        scream
    riot        carouse
    riot        saturnalia
    ...         ...
    conflict    Fight
    conflict    battle
    ...         ...
Answer
This is a possible solution:
    from nltk.corpus import wordnet
    import pandas as pd

    def process_genres(genres):
        return (pd.DataFrame([(genre, l.name())
                              for genre in genres
                              for syn in wordnet.synsets(genre)
                              for l in syn.lemmas()],
                             columns=['Col1', 'Col2'])
                .drop_duplicates())
Here’s how you can use it:
    >>> genres = ['protest', 'riot', 'conflict']
    >>> df = process_genres(genres)
    >>> df
            Col1             Col2
    0    protest          protest
    1    protest     protestation
    ...
    11      riot             riot
    12      riot  public_violence
    13      riot          rioting
    ...
    34  conflict         conflict
    35  conflict         struggle
    36  conflict           battle
    ...
    53  conflict       contravene
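The `drop_duplicates()` call matters because the same lemma can belong to several synsets of one word, so the raw list of pairs contains repeats. As a rough sketch of the same idea, the pair-building can be separated from WordNet so that it is easy to check with a stub. The names `synonym_rows`, `wordnet_lemmas` and `synonyms_frame` are hypothetical helpers, not part of NLTK or pandas, and the sketch assumes the WordNet corpus has already been downloaded with `nltk.download('wordnet')`.

```python
from typing import Callable, Iterable, List, Tuple


def synonym_rows(
    words: Iterable[str],
    lookup: Callable[[str], Iterable[str]],
) -> List[Tuple[str, str]]:
    """Return unique (word, synonym) pairs in first-seen order."""
    seen = set()
    rows = []
    for word in words:
        for synonym in lookup(word):
            pair = (word, synonym)
            if pair not in seen:  # a lemma can appear in several synsets
                seen.add(pair)
                rows.append(pair)
    return rows


def wordnet_lemmas(word: str) -> List[str]:
    """Lemma names across all synsets of `word` (needs the WordNet corpus)."""
    # Imported lazily so synonym_rows can be used and tested without NLTK.
    from nltk.corpus import wordnet
    return [lemma.name() for syn in wordnet.synsets(word) for lemma in syn.lemmas()]


def synonyms_frame(words: Iterable[str], lookup=wordnet_lemmas):
    """Build the two-column data frame from the deduplicated pairs."""
    # Imported lazily so synonym_rows works without pandas installed.
    import pandas as pd
    return pd.DataFrame(synonym_rows(words, lookup), columns=["Col1", "Col2"])
```

Calling `synonyms_frame(['protest', 'riot', 'conflict'])` should give the same rows as `process_genres`. One difference is that the index here runs consecutively from 0, whereas `drop_duplicates()` keeps the original row labels, which is why the output above has gaps in its index.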