How to store synonyms as a column in a data frame?

I want to store the result of the code below in a data frame.

It should have two columns: one with the original word and one with a synonym, with each synonym in its own row.

import nltk
from nltk.corpus import wordnet
import pandas as pd

genres = ['protest', 'riot', 'conflict']

def process_genre(genres):
    # Print the set of WordNet lemma names found for each word.
    for genre in genres:
        result = []
        for syn in wordnet.synsets(genre):
            for l in syn.lemmas():
                result.append(l.name())
        print(set(result))

process_genre(genres)

output:
-------
{'resist', 'objection', 'dissent', 'protestation', 'protest'}
{'bacchanalia', 'riot', 'saturnalia', 'belly_laugh', 'scream', 'wow', 'bacchanal', 'thigh-slapper', 'sidesplitter', 'drunken_revelry', 'carouse', 'rioting', 'roister', 'debauchery', 'orgy', 'public_violence', 'howler', 'debauch'}
{'fight', 'battle', 'difference', 'dispute', 'conflict', 'infringe', 'engagement', 'struggle', 'difference_of_opinion', 'contravene', 'run_afoul'}

I want to store the result in a data frame like this:

# Expected Result:

Col1           Col2
--------------------
protest       resist
protest       objection
protest       dissent
...           ...
riot          scream
riot          carouse
riot          saturnalia
...           ...
conflict      fight
conflict      battle
...           ...


Answer

This is a possible solution:

from nltk.corpus import wordnet
import pandas as pd

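def process_genres(genres):
    # Build one (word, lemma name) row per lemma of every synset,
    # then drop the rows that repeat an earlier (word, lemma) pair.
    return (pd.DataFrame([(genre, l.name())
                          for genre in genres
                          for syn in wordnet.synsets(genre)
                          for l in syn.lemmas()], columns=['Col1', 'Col2'])
              .drop_duplicates())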

Here’s how you can use it:

>>> genres = ['protest', 'riot', 'conflict']
>>> df = process_genres(genres)
>>> df
        Col1                   Col2
0    protest                protest
1    protest           protestation
...
11      riot                   riot
12      riot        public_violence
13      riot                rioting
...
34  conflict               conflict
35  conflict               struggle
36  conflict                 battle
...
53  conflict             contravene
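
Note that the result above still contains rows where the word is listed as its own synonym (for example protest / protest), and multi-word lemmas keep their underscores. If that is not wanted, the pairs can be cleaned up before they go into the data frame. The following is only a sketch under some assumptions: synonym_pairs, lookup and SAMPLE are hypothetical names introduced for illustration, and the sample data is a hand-written stand-in so the snippet runs without the WordNet corpus.

```python
from typing import Callable, Iterable, List, Tuple


def synonym_pairs(
    words: Iterable[str],
    lookup: Callable[[str], Iterable[str]],
) -> List[Tuple[str, str]]:
    """Return unique (word, synonym) pairs in first-seen order.

    Skips pairs where the synonym equals the word itself and
    replaces underscores in multi-word lemma names with spaces.
    """
    seen = set()
    pairs = []
    for word in words:
        for name in lookup(word):
            synonym = name.replace("_", " ")
            pair = (word, synonym)
            # Drop self-matches and pairs that were already collected.
            if synonym.lower() == word.lower() or pair in seen:
                continue
            seen.add(pair)
            pairs.append(pair)
    return pairs


# Hand-written stand-in data (an assumption, not real WordNet output order).
# With NLTK the lookup could instead be:
#   lambda w: (l.name() for s in wordnet.synsets(w) for l in s.lemmas())
SAMPLE = {
    "protest": ["protest", "protestation", "objection", "protest", "dissent"],
    "riot": ["riot", "public_violence", "rioting"],
}

pairs = synonym_pairs(["protest", "riot"], lambda w: SAMPLE.get(w, []))
```

The resulting list of tuples can then be passed to pd.DataFrame(pairs, columns=['Col1', 'Col2']), which gives a data frame with a clean 0..n-1 index and no need for drop_duplicates().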

User contributions licensed under: CC BY-SA