Extracting codes with regex (irregular regex keys)

Question

I'm extracting codes from a list of strings that comes from email titles. Which looks something like:

So far what I tried is:

My issue is that I'm not able to extract the code that follows one of the keys ['PN', 'P/N', 'PN:', 'P/N:'], especially if the code starts with a letter (e.g. 'M') or if it

Accepted Answer

In your pattern, the character class [p/n:]\s+ matches one of the listed characters followed by 1+ whitespace chars. In the example data, that matches a forward slash or a colon followed by a space.

The next part, (?:\w+(?:\s+|$)), matches 1+ word characters followed by either the end of the string or 1+ whitespace chars, without taking a hyphen or a whitespace char in the middle of the code into account.

One option is to match PN with an optional / and : instead of using the character class [p/n:], and to have your value in a capturing group:

    / P/?N:? ([\w-]+)

For example:

    import re

    text_list = [
        'Industry / Gemany / PN M564839',
        'Industry / France / PN: 575-439',
        'Telecom / Gemany / P/N 26-59-29',
        'Mobile / France / P/N: 88864839',
    ]

    # Optional "/" between P and N, optional ":" after N, code in group 1
    regex = r"/ P/?N:? ([\w-]+)"

    res = []
    for text in text_list:
        matches = re.search(regex, text)
        if matches:
            res.append(matches.group(1))

    print(res)

Result:

    ['M564839', '575-439', '26-59-29', '88864839']
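To make the difference between the character class and the literal key more concrete, here is a minimal sketch. The first part shows what [p/n:]\s+ actually matches on one of the sample titles. The second part is a hypothetical variant, not the pattern from the answer: the names PART_NUMBER and extract_codes are made up for illustration, and it assumes you want the key matched case-insensitively and without requiring the leading "/ ".

```python
import re

# Sample titles taken from the answer above.
titles = [
    "Industry / Gemany / PN M564839",
    "Industry / France / PN: 575-439",
    "Telecom / Gemany / P/N 26-59-29",
    "Mobile / France / P/N: 88864839",
]

# A character class matches ONE character from the set, never the sequence "P/N:".
char_class_hits = re.findall(r"[p/n:]\s+", titles[1])
print(char_class_hits)  # ['/ ', '/ ', ': ']

# Hypothetical variant (an assumption, not the answer's pattern):
# a word boundary replaces the leading "/ ", and the key is case-insensitive.
PART_NUMBER = re.compile(r"\bP/?N:?\s+([\w-]+)", re.IGNORECASE)


def extract_codes(strings):
    """Return the first code found in each string; strings without a key are skipped."""
    codes = []
    for s in strings:
        m = PART_NUMBER.search(s)
        if m:
            codes.append(m.group(1))
    return codes


print(extract_codes(titles + ["No code here"]))
# ['M564839', '575-439', '26-59-29', '88864839']
```

The word boundary keeps the pattern from firing inside a longer word that happens to end in "pn", but it is looser than the answer's version, so it may pick up keys you do not want if the titles vary more than the four samples shown here.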

Answer