What would be the regex pattern for the following?

Question

I have multiple strings in formats like these. Example:

    A='AB.224-QW-2018'
    B='AB.876-5-LS-2018'
    C='AB.26-LS-18'
    D='AB-123-6-LS-2017'
    E='IA-Mb-22L-AB.224-QW-2018-IA-Mb-22L'
    F='ZX-ss-12L-AB-123-6-LS-2017-BC-22'
    G='AB.224-2018'
    H=…

Accepted Answer

You could use 3 capture groups:

    \b(AB)\D*(\d+)\S*?(?:20)?(\d\d)\b

The pattern in parts:

    \b        A word boundary to prevent a partial word match
    (AB)      Capture AB in group 1
    \D*       Match optional non-digits
    (\d+)     Capture 1+ digits in group 2
    \S*?      Optionally match non-whitespace characters, as few as possible
    (?:20)?   Optionally match 20
    (\d\d)    Capture 2 digits in group 3
    \b        A word boundary

For example, using re.finditer, which returns Match objects that each hold the group values. Using enumerate you can loop over the matches. Every item in the iteration is a tuple, where the first value is the count (which you don't need here) and the second value is the Match object.

    import re

    pattern = r"\b(AB)\D*(\d+)\S*?(?:20)?(\d\d)\b"

    s = ("A='AB.224-QW-2018'\n"
         "B='AB.876-5-LS-2018'\n"
         "C='AB.26-LS-18'\n"
         "D='AB-123-6-LS-2017'\n"
         "IA-Mb-22L-AB.224-QW-2018-IA-Mb-22L' F='ZX-ss-12L-AB-123-6-LS-2017-BC-22\n"
         "A='AB.224-QW-2018'\n"
         "B='AB.876-5-LS-2018'\n"
         "C='AB.26-LS-18'\n"
         "D='AB-123-6-LS-2017'\n"
         "E='IA-Mb-22L-AB.224-QW-2018-IA-Mb-22L'\n"
         "F='ZX-ss-12L-AB-123-6-LS-2017-BC-22'\n"
         "G='AB.224-2018'\n"
         "H='AB.224/QW/2018'\n"
         "I='AB/224/2018'")

    # finditer yields one Match object per occurrence of the pattern
    matches = re.finditer(pattern, s)

    # The counter from enumerate is unused, hence the underscore
    for _, m in enumerate(matches, start=1):
        print(m.group(1) + "/" + m.group(2) + "/" + m.group(3))

Output

    AB/224/18
    AB/876/18
    AB/26/18
    AB/123/17
    AB/224/18
    AB/123/17
    AB/224/18
    AB/876/18
    AB/26/18
    AB/123/17
    AB/224/18
    AB/123/17
    AB/224/18
    AB/224/18
    AB/224/18
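The part-by-part explanation in this answer can also be kept next to the pattern itself. The following is a minimal sketch, assuming the same pattern compiled with re.VERBOSE so that each piece carries its own comment. The names PATTERN and normalize are illustrative and not part of the original answer.

```python
import re

# Same pattern as in the answer, spread over several lines with re.VERBOSE.
# PATTERN and normalize() are hypothetical names chosen for this sketch.
PATTERN = re.compile(
    r"""
    \b(AB)      # group 1: literal AB starting at a word boundary
    \D*         # optional non-digits (a separator such as . or - or /)
    (\d+)       # group 2: one or more digits
    \S*?        # as few non-whitespace characters as possible
    (?:20)?     # optional century prefix 20, not captured
    (\d\d)\b    # group 3: two digits ending at a word boundary
    """,
    re.VERBOSE,
)


def normalize(text):
    """Return each match in text joined as 'AB/<number>/<two-digit year>'."""
    return ["/".join(m.groups()) for m in PATTERN.finditer(text)]


print(normalize("A='AB.224-QW-2018' C='AB.26-LS-18' I='AB/224/2018'"))
# ['AB/224/18', 'AB/26/18', 'AB/224/18']
```

Since all three groups always participate in a match, m.groups() can be joined directly instead of concatenating m.group(1), m.group(2) and m.group(3).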

Answer