How to write text parser logic that identifies keywords from a dictionary?

How can I make a simple text parser that finds keywords and categorizes them accordingly?

Example: I have two dictionaries:

A = {'1': 'USA', '2': 'Canada', '3': 'Germany'}
B = {'t1': "The temp in USA is x", 't2': 'Germany is very cold now', 't3': 'Weather in Canada is good', 't4': 'USA is cold right now'}

Now I want to find which keywords from A appear in the values of B, and the result should look something like this:

Result = {'1': ('t1', 't4'), '2' : 't3', '3': 't2'}

I’m a beginner and the logic to get this is very confusing.

Answer

You can do this with a dict comprehension:

A = {'1': 'USA', '2': 'Canada', '3': 'Germany'}
B = {'t1': "The temp in USA is x", 't2': 'Germany is very cold now', 't3': 'Weather in Canada is good', 't4': 'USA is cold right now'}


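# For each keyword in A, collect the keys of B whose text contains it as a whole word
result = {k: [k_b for k_b, v_b in B.items() if v in v_b.split()] for k, v in A.items()}
print(result)
# {'1': ['t1', 't4'], '2': ['t3'], '3': ['t2']}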

This makes every value in the dict a list, rather than some values being tuples and others plain strings as in your expected result. That’s almost certainly going to be easier to work with than a mixed-type dictionary.
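
If the comprehension is hard to read, it may help to see the same logic unrolled into explicit loops. This is only a sketch of an equivalent form; the names `result` and `matches` are chosen here for illustration and are not part of the question.

```python
A = {'1': 'USA', '2': 'Canada', '3': 'Germany'}
B = {'t1': "The temp in USA is x", 't2': 'Germany is very cold now',
     't3': 'Weather in Canada is good', 't4': 'USA is cold right now'}

result = {}
for k, v in A.items():            # k is the keyword's key, v the keyword itself
    matches = []
    for k_b, v_b in B.items():    # k_b is the text's key, v_b the text
        # split() breaks the text on whitespace, so only whole words match
        if v in v_b.split():
            matches.append(k_b)
    result[k] = matches           # always a list, even for zero or one match

print(result)
# {'1': ['t1', 't4'], '2': ['t3'], '3': ['t2']}
```

Because the check uses `split()`, a keyword followed directly by punctuation (for example "USA,") would not match; that is an assumption about the input worth checking against your real data.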

If your dicts are going to be large, you might pick up some performance by inverting the B dictionary into a mapping from each word to the keys whose text contains it, so that looking up a keyword no longer scans every value of B.
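
One way the inversion could look is sketched below. It builds the word-to-keys mapping once, then answers each keyword with a single dictionary lookup. The name `index` and the use of `collections.defaultdict` are choices made for this sketch, not something the question prescribes.

```python
from collections import defaultdict

A = {'1': 'USA', '2': 'Canada', '3': 'Germany'}
B = {'t1': "The temp in USA is x", 't2': 'Germany is very cold now',
     't3': 'Weather in Canada is good', 't4': 'USA is cold right now'}

# Inverted index: each word maps to the list of B keys whose text contains it.
index = defaultdict(list)
for k_b, v_b in B.items():
    # set() ensures a word repeated within one text is recorded only once
    for word in set(v_b.split()):
        index[word].append(k_b)

# .get() avoids adding empty entries to the index for keywords that never occur
result = {k: index.get(v, []) for k, v in A.items()}

print(result)
# {'1': ['t1', 't4'], '2': ['t3'], '3': ['t2']}
```

Building the index costs one pass over all the words in B, so it is likely to pay off only when A has many keywords or the lookup is repeated.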

User contributions licensed under: CC BY-SA