How to extract exact match from list of strings in Python into separate lists

Question

This is an example list of strings. I would like to extract XIC(Switch_A) into one list, OTE(Light1) into another list, TON(Motor_timer) into another list, and so on. This is the code in Python 3 that I have tried. How do I go about extracting OTE(Tag name), XIC(Tag name), XIO(Tag name) into their own lists or groups?

Accepted Answer

You can use the following regex to match any three uppercase letters, followed by anything in parentheses:

    ([A-Z]{3})(\([^)]+\))
    (        )             : Capturing group 1
              (         )  : Capturing group 2
     [A-Z]{3}              : Exactly three uppercase letters
               \(     \)   : Literal open/close parentheses
                 [^)]+     : One or more of any character that is not )

You can test this pattern on Regex101.

Use a collections.defaultdict to keep track of all your results. The identifier will be the key for this defaultdict, and the values will be lists containing all the matches for that identifier.

    import re
    from collections import defaultdict

    results = defaultdict(list)
    regex = re.compile(r"([A-Z]{3})(\([^)]+\))")

    # new_text is the list of strings from the question
    for s in new_text:
        matches = regex.findall(s)
        for m in matches:
            # m[0] is the identifier, m[1] is the parenthesized tag name
            identifier = m[0]
            results[identifier].append(m[0] + m[1])

Which gives the following results:

    {'XIC': ['XIC(Switch_A)', 'XIC(Light1)', 'XIC(Light1)', 'XIC(Motor_timer.DN)'], 'OTE': ['OTE(Light1)', 'OTE(Light2)', 'OTE(Motor)']}

Since you have a fixed set of identifiers, you can replace the [A-Z]{3} portion of the regex with something that will only match your identifiers:

    regex = re.compile(r"(XIC|XIO|OTE|TON|TOF)(\([^)]+\))")

It is also possible to build this regex if you have your identifiers in an iterable:

    identifiers = ["XIC", "XIO", "OTE", "TON", "TOF"]
    regex = re.compile(rf"({'|'.join(identifiers)})(\([^)]+\))")
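To try the approach end to end, here is a minimal self-contained sketch. The input list `sample_rungs` and the helper name `group_instructions` are hypothetical, made up for illustration because the question's actual data is not shown here. It also passes each identifier through `re.escape`, which is an extra precaution and not part of the answer above.

```python
import re
from collections import defaultdict

# Hypothetical sample input; the question's real list is not reproduced here.
sample_rungs = [
    "XIC(Switch_A)OTE(Light1)",
    "XIO(Switch_B)TON(Motor_timer)",
    "XIC(Motor_timer.DN)OTE(Motor)",
]


def group_instructions(lines, identifiers):
    """Group every IDENT(...) occurrence found in lines by its identifier."""
    # Escape each identifier so regex metacharacters are matched literally.
    alternation = "|".join(re.escape(name) for name in identifiers)
    pattern = re.compile(rf"({alternation})(\([^)]+\))")

    grouped = defaultdict(list)
    for line in lines:
        # findall returns one (identifier, "(tag)") tuple per match.
        for identifier, argument in pattern.findall(line):
            grouped[identifier].append(identifier + argument)
    # Convert to a plain dict so missing keys raise instead of being created.
    return dict(grouped)


grouped = group_instructions(sample_rungs, ["XIC", "XIO", "OTE", "TON", "TOF"])
print(grouped)
```

Identifiers that never occur in the input (TOF in this sample) simply have no key in the result, so use `grouped.get("TOF", [])` if you need an empty list for them.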

Answer