
Extracting codes with regex (irregular regex keys)

I'm extracting codes from a list of strings that come from email titles. The list looks something like:

JavaScript

So far what I tried is:

JavaScript

My issue is that I'm not able to extract the code that follows any of the markers ['PN', 'P/N', 'PN:', 'P/N:'], especially if the code starts with a letter (e.g. 'M') or contains a hyphen (e.g. 26-59-29).

My desired output would be:

JavaScript


Answer

In your pattern, the character class followed by a quantified whitespace token, [p/n:]\s+, matches exactly one of the listed characters followed by one or more whitespace characters. In the example data, that matches a forward slash or a colon followed by a space, not the whole marker.
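As a rough illustration of that point, the sketch below runs the character-class fragment on its own. The title string is hypothetical (the question's real data is not reproduced here), and only the standard `re` module is used.

```python
import re

# Hypothetical title, made up for illustration only.
title = "RFQ P/N: M1234 urgent"

# A character class matches ONE character from the set,
# not the sequence "p/n:" as a whole.
matches = re.findall(r"[p/n:]\s+", title)
print(matches)
```

Here the only hit is the colon plus the space after it, which shows why the class cannot stand in for the full marker.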

The next part, (?:\w+(?:\s+|$)), matches one or more word characters followed by either one or more whitespace characters or the end of the string. It does not account for a hyphen or a whitespace character in the middle of the code.
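A small sketch of that behaviour, using made-up sample values rather than the question's data: the fragment accepts a plain alphanumeric code but finds nothing when the code contains hyphens.

```python
import re

# The fragment discussed above, with the backslashes written out.
fragment = r"(?:\w+(?:\s+|$))"

# A plain code followed by a space matches (including the trailing space).
plain = re.match(fragment, "M1234 rest")

# A hyphenated code does not: after "26" comes "-", which is neither
# whitespace nor the end of the string.
hyphenated = re.match(fragment, "26-59-29")

print(plain.group(0) if plain else None)
print(hyphenated)
```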

One option is to match PN with an optional : and / instead of using a character class [p/n:] and have your value in a capturing group:

JavaScript
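A minimal sketch of that idea follows. It is not necessarily the exact pattern from the answer: the regex, the helper name `extract_code`, and the sample titles are all assumptions made for illustration. It also assumes codes are alphanumeric chunks optionally joined by hyphens; if codes can contain internal spaces, the capturing group would need to be widened.

```python
import re

# Assumed pattern: "P", optional "/", "N", optional ":", whitespace,
# then the code in group 1 (alphanumeric chunks joined by hyphens).
PN_PATTERN = re.compile(
    r"\bP/?N:?\s+([A-Z0-9]+(?:-[A-Z0-9]+)*)",
    re.IGNORECASE,
)

def extract_code(title):
    """Return the code after PN / P/N / PN: / P/N:, or None if absent."""
    match = PN_PATTERN.search(title)
    return match.group(1) if match else None

# Hypothetical titles, not the data from the question.
titles = [
    "Quote request PN M1234",
    "RFQ P/N: 26-59-29 urgent",
    "Order PN: AB-77",
    "No part number here",
]

codes = [extract_code(t) for t in titles]
print(codes)
```

Writing the marker out as a literal sequence with optional parts keeps the four variants in one pattern, and the capturing group isolates the value so nothing has to be stripped afterwards.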

Regex demo | Python demo

For example:

JavaScript

Result

JavaScript