Calculating the semantic descriptor of a nested list

Question

I am trying to calculate the semantic descriptor of a nested list and store it as a nested dictionary. First I collected distinct_words; each of those words will be a key of my final dictionary.

Expected output: {'i': {'am': 3, 'a': 2, 'sick': 1, 'man': 3, 'spiteful': 1, 'an': 1, 'unattractive': 1, 'believe': 1, 'my': 2, 'liver': 1, 'is': 1,

Accepted Answer

Try this:

    from collections import defaultdict
    from itertools import product


    def build_semantic_descriptors(sentences):
        # Outer dict maps each word to an inner dict of co-occurrence counts;
        # a missing word gets an empty inner dict, a missing count starts at 0.
        d = defaultdict(lambda: defaultdict(int))
        for sentence in sentences:
            # The flag is reset once per sentence, so only the first
            # (word, word) self-pair met in each sentence is skipped.
            should_skip_key = True
            for (key, word) in product(sentence, sentence):
                if key == word and should_skip_key:
                    should_skip_key = False
                    continue
                d[key][word] += 1
        return d


    if __name__ == '__main__':
        x = [["i", "am", "a", "sick", "man"],
             ["i", "am", "a", "spiteful", "man"],
             ["i", "am", "an", "unattractive", "man"],
             ["i", "believe", "my", "liver", "is", "diseased"],
             ["however", "i", "know", "nothing", "at", "all", "about", "my",
              "disease", "and", "do", "not", "know", "for", "certain", "what", "ails", "me"]]
        print(build_semantic_descriptors(x))

You need to loop over each sentence twice in order to pair every word with every key. For this you can use itertools.product.

Also note that I use collections.defaultdict here, which you should read about. It is a handy utility that supplies a default value when a key does not exist, which lets you skip the existence check that you had.
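To see what itertools.product and a nested defaultdict do on a small input, here is a minimal sketch. It is an illustration only, not the code from the answer: the function name count_pairs_skipping_self is hypothetical, and skipping every pair in which a word position is matched with itself is an assumption about what the descriptor should contain, so adjust it if your expected output differs.

```python
from collections import defaultdict
from itertools import product


def count_pairs_skipping_self(sentences):
    """Hypothetical variant: count co-occurrences, never pairing a position with itself."""
    d = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        # product(..., repeat=2) yields every ordered pair of (index, word) items.
        for (i, key), (j, word) in product(enumerate(sentence), repeat=2):
            if i == j:
                # Same position in the sentence: not a co-occurrence (assumption).
                continue
            d[key][word] += 1
    return d


# product pairs every element with every element, including itself.
pairs = list(product(["a", "b"], repeat=2))
print(pairs)  # [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]

result = count_pairs_skipping_self([["i", "am", "a", "man"], ["i", "am", "sick"]])
# Convert the nested defaultdicts to plain dicts for readable printing.
plain = {key: dict(counts) for key, counts in result.items()}
print(plain["i"])  # {'am': 2, 'a': 1, 'man': 1, 'sick': 1}
```

Converting the inner defaultdict objects with dict(), as in the last lines, gives printed output in the plain-dictionary form used in the question.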

Answer