I’ve written some code, but it does not output what I expected.
Here is the code:
query_words = ['dollar', 'probabilistic']
query_word_to_synonym_dict = {'probabilistic': ['probabilistic'], 'dollar' : ['currency']}
mail_ids = {123, 108}
big_ds = {}
empty_dict = {}
index = {'probabilistic':{(108, 1)}, 'currency':{(123, 1)}}
for mail_id in mail_ids:
    empty_dict = dict.fromkeys(query_words, [])
    big_ds.update({mail_id:empty_dict})
for query_word in query_words:
    syns = query_word_to_synonym_dict[query_word]
    for syn in syns:
        index_of_word = index[syn]
        tuple_first = []
        for tuples in index_of_word:
            tuple_first.append(tuples[0])
        for number in tuple_first:
            (big_ds[number][query_word]).append(syn)
print(big_ds)
The expected final value of big_ds is:
{123: {'dollar': ['currency'], 'probabilistic': []}, 108: {'dollar': [], 'probabilistic': ['probabilistic']}}
But the code sets the value of big_ds to the following:
{123: {'dollar': ['currency'], 'probabilistic': ['currency']}, 108: {'dollar': ['probabilistic'], 'probabilistic': ['probabilistic']}}
I asked a similar question a while back, "How do I resolve this unexpected output in Python code?", and was able to solve the issue for that use case. But the code I wrote then fails when query_words has more than one element.
I can’t seem to figure out how to fix things. Any solution?
Answer
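It's because of this call:
dict.fromkeys(query_words, [])
The [] is evaluated only once per call, so every key in a mail_id sub-dict shares the same list instance. Appending through one key therefore shows up under every other key of that sub-dict.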
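To see the sharing in isolation, here is a small sketch that compares dict.fromkeys with a dict comprehension. The names keys, shared and separate are made up for this illustration and are not part of the code in the question.

```python
keys = ['dollar', 'probabilistic']

# dict.fromkeys evaluates [] once, so both keys point at one list object.
shared = dict.fromkeys(keys, [])
shared['dollar'].append('currency')
print(shared)  # {'dollar': ['currency'], 'probabilistic': ['currency']}
print(shared['dollar'] is shared['probabilistic'])  # True

# A dict comprehension evaluates [] once per key, giving independent lists.
separate = {key: [] for key in keys}
separate['dollar'].append('currency')
print(separate)  # {'dollar': ['currency'], 'probabilistic': []}
print(separate['dollar'] is separate['probabilistic'])  # False
```

The `is` check is what matters here: two keys that hold the very same list object will always appear to change together.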
See:
- “Least Astonishment” and the Mutable Default Argument
- Dictionary creation with fromkeys and mutable objects. A surprise
Try this instead:
query_words = ['dollar', 'probabilistic']
query_word_to_synonym_dict = {'probabilistic': ['probabilistic'], 'dollar' : ['currency']}
mail_ids = {123, 108}
big_ds = {}
index = {'probabilistic':{(108, 1)}, 'currency':{(123, 1)}}
for mail_id in mail_ids:
    big_ds[mail_id] = {word: [] for word in query_words}
for query_word in query_words:
    syns = query_word_to_synonym_dict[query_word]
    for syn in syns:
        index_of_word = index[syn]
        tuple_first = []
        for tuples in index_of_word:
            tuple_first.append(tuples[0])
        for number in tuple_first:
            big_ds[number][query_word].append(syn)
print(big_ds)
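If it helps, the same fix can be wrapped in a function and checked against the expected value from the question. This is only a sketch: build_big_ds and its parameter names are hypothetical (they are not in the original code), and the second element of each index tuple is ignored, just as in the original loop.

```python
def build_big_ds(query_words, synonyms, mail_ids, index):
    """Map each mail id to {query_word: [synonyms found in that mail]}."""
    # The inner comprehension creates a fresh list per key and per mail id,
    # so no two entries share a list.
    big_ds = {mail_id: {word: [] for word in query_words} for mail_id in mail_ids}
    for query_word in query_words:
        for syn in synonyms[query_word]:
            # index[syn] is a set of tuples; only the first element (the
            # mail id) is used, the second is ignored as in the question.
            for mail_id, _ in index[syn]:
                big_ds[mail_id][query_word].append(syn)
    return big_ds


query_words = ['dollar', 'probabilistic']
query_word_to_synonym_dict = {'probabilistic': ['probabilistic'], 'dollar': ['currency']}
mail_ids = {123, 108}
index = {'probabilistic': {(108, 1)}, 'currency': {(123, 1)}}

result = build_big_ds(query_words, query_word_to_synonym_dict, mail_ids, index)
expected = {
    123: {'dollar': ['currency'], 'probabilistic': []},
    108: {'dollar': [], 'probabilistic': ['probabilistic']},
}
print(result == expected)  # True
```

Like the original loop, this raises a KeyError if the index mentions a mail id that is not in mail_ids.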