My goal is to obtain all possible substitutions for all overlapping matches of a given regex.
Normally, when I want to perform a substitution with a regex, I do the following:
import re
re.sub(pattern='III', repl='U', string='MIIII')
and I would obtain the following output:
MUI
As stated in the documentation, when matches overlap only the leftmost one is replaced. What I need is to obtain all the possible substitutions, which in this case are:
MUI
MIU
I also want to use this with more complex regex patterns, like the following:
re.sub(pattern="M(.*)$", repl=r"M\1\1", string='MIU')
MIUIU
I didn’t find a native solution for this in the Python standard library.
Answer
One way to implement this is to search for the pattern (using re.search()) until no match is found, replacing just a single occurrence each time (using re.sub() with the count argument) and slicing the string on every iteration to skip past the start of the previous match.
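import re

source = "MMM123"
pattern = re.compile("M(.*)$")
replacement = r"M\1\1"

last_start = 0  # offset in source where the current slice begins
temp = source
while match := pattern.search(temp):
    # Replace only the first match in the slice, then re-attach the skipped prefix.
    print(source[:last_start], pattern.sub(replacement, temp, 1), sep="")
    # Move one character past the start of this match so overlapping matches are found.
    last_start += match.start() + 1
    temp = source[last_start:]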
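The same idea can be wrapped in a reusable generator. This is only a sketch, not part of the original answer: the name overlapping_subs is hypothetical, and instead of slicing it uses the pos argument of Pattern.search() and builds each result with Match.expand(), so it assumes repl is a template string and not a function.

```python
import re


def overlapping_subs(pattern, repl, string):
    """Yield one string per match (overlaps included), replacing only that match.

    Hypothetical helper; assumes `repl` is a template string such as r"M\1\1".
    """
    regex = re.compile(pattern)
    pos = 0
    while pos <= len(string):
        match = regex.search(string, pos)
        if match is None:
            break
        # Match.expand() resolves backreferences (\1, \g<name>) in the template.
        yield string[:match.start()] + match.expand(repl) + string[match.end():]
        # Restart one character after this match began, so overlaps are not skipped.
        pos = match.start() + 1


print(list(overlapping_subs("III", "U", "MIIII")))          # ['MUI', 'MIU']
print(list(overlapping_subs(r"M(.*)$", r"M\1\1", "MIU")))   # ['MIUIU']
```

One difference from the slicing version: with pos, anchors such as ^ and lookbehind assertions still see the whole original string, whereas a slice makes every iteration look like the start of a new string. Which behaviour is preferable depends on the patterns you need to support.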