Converting a list into a dictionary of the form {name: [other names]} using a function

I currently have a set of captions in the form of a list:

print(valid_captions)

-> [' Les Lieberman, Barri Lieberman, Isabel Kallman, Trish Iervolino, and Ron Iervolino ', ' Chuck Grodin ', ' Diana Rosario, Ali Sussman, Sarah Boll, Jen Zaleski, Alysse Brennan, and Lindsay Macbeth ', ' Kelly Murro and Tom Murro ', ' Ron Iervolino, Trish Iervolino, Russ Middleton, and Lisa Middleton ']

I want to create a function that iterates over each element of the list and builds an adjacency list for each person, so that I can get the unique names of everyone who appears with them anywhere in the data set. I want to represent this adjacency list as a Python dictionary, with each name as a key and the list of names they appear with as the value.

So the function would take a single caption and return a dictionary of the form {name: [other names in caption]} for each name, while removing any titles like Dr or Mayor.

As an example I would like this

[Dr .Ron Iervolino, Trish Iervolino, and Mayor.Russ Middleton]

to return

{'Ron Iervolino': ['Trish Iervolino', 'Russ Middleton'],
 'Trish Iervolino': ['Ron Iervolino', 'Russ Middleton'],
 'Russ Middleton': ['Ron Iervolino', 'Trish Iervolino']}

If someone appears in a caption by themselves, return {name: []}. So the caption 'Robb Stark' would return {'Robb Stark': []}.

I have a function to remove the titles, but I’m getting the adjacency list all wrong.

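import re

def remove_title(names):
    # Strip a leading "Dr" or "Mayor" (plus any dots/spaces after it) from each name.
    removed_list = []
    for name in names:
        altered_name = re.sub(r'^\s*(?:Dr|Mayor)\b[\s.]*', '', name).strip()
        # Skip entries that are empty once the title is gone.
        if altered_name:
            removed_list.append(altered_name)
    return removed_list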

Answer

The following is my solution: a function that takes a caption and returns a dictionary of the form {name: [other names in caption]} for each name.

The function first cleans the caption with string operations: it splits on commas and on the word 'and', removes titles such as 'Mayor' and 'Dr', and uses strip() to drop leading and trailing spaces. It then discards the empty strings left over from the split and loops over the remaining names to build the dictionary.

import re

def format_caption(caption):
    # Split on commas and on the standalone word "and" (so "Roland" stays intact).
    name_list = re.split(r',|\band\b', caption)
    # Drop a leading title (Dr, Mayor) along with any dots or spaces after it.
    name_list = [re.sub(r'^\s*(?:Dr|Mayor)\b[\s.]*', '', name).strip() for name in name_list]
    # Discard the empty strings left over from splitting.
    name_list = [name for name in name_list if name]
    name_dict = {}
    for name in name_list:
        # Every other name in the caption is adjacent to this one.
        name_dict[name] = [other for other in name_list if other != name]
    return name_dict

The resulting function returns the caption in the format I was looking for:

caption = 'Dr .Ron Iervolino, Trish Iervolino, and Mayor.Russ Middleton'
print(format_caption(caption))

-> {'Ron Iervolino': ['Trish Iervolino', 'Russ Middleton'],
 'Trish Iervolino': ['Ron Iervolino', 'Russ Middleton'],
 'Russ Middleton': ['Ron Iervolino', 'Trish Iervolino']}
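The question also asks for the unique names each person appears with across the whole data set, not just within one caption. One way to get there is to merge the per-caption results into sets. The following is only a sketch: the helper names names_in_caption and build_adjacency are hypothetical, the title list is limited to the two titles mentioned in the question, and the sample captions are a subset of the ones shown above.

```python
import re
from collections import defaultdict

# Assumption: only "Dr" and "Mayor" need removing, as in the question.
TITLE = re.compile(r'^\s*(?:Dr|Mayor)\b[\s.]*')

def names_in_caption(caption):
    # Hypothetical helper: split one caption into cleaned, non-empty names.
    pieces = re.split(r',|\band\b', caption)
    names = [TITLE.sub('', piece).strip() for piece in pieces]
    return [name for name in names if name]

def build_adjacency(captions):
    # Map each name to the set of everyone they appear with in any caption.
    adjacency = defaultdict(set)
    for caption in captions:
        names = names_in_caption(caption)
        for name in names:
            # Touching adjacency[name] also registers people who appear alone.
            adjacency[name].update(other for other in names if other != name)
    # Sort each set so the result is a plain dict of lists in a stable order.
    return {name: sorted(others) for name, others in adjacency.items()}

valid_captions = [
    ' Chuck Grodin ',
    ' Kelly Murro and Tom Murro ',
    ' Ron Iervolino, Trish Iervolino, Russ Middleton, and Lisa Middleton ',
    ' Les Lieberman, Barri Lieberman, Isabel Kallman, Trish Iervolino, and Ron Iervolino ',
]
graph = build_adjacency(valid_captions)
print(graph['Chuck Grodin'])  # []
print(graph['Kelly Murro'])   # ['Tom Murro']
```

Using a set per person means someone who shares several captions with the same companion is still listed only once.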
User contributions licensed under: CC BY-SA