
How to substitute only second occurrence of re.search() group

I need to pad part of a string value with leading zeroes where needed.

For example, T-46-5-В,Г,6-В,Г -> T-46-005-В,Г,006-В,Г, or T-46-55-В,Г,56-В,Г -> T-46-055-В,Г,056-В,Г.

I have the regex pattern ^\D-\d{1,2}-([\d,]+)-[а-яА-я,]+,([\d,]+)-[а-яА-я,]+$ that captures the 2 separate groups of the string that I must change. The problem is that I can't substitute the changed values back into exactly those groups if the text matched by re.search().group() occurs somewhere else in the string.

import re

my_string = "T-46-5-В,Г,6-В,Г"
my_pattern = r"^\D-\d{1,2}-([\d,]+)-[а-яА-я,]+,([\d,]+)-[а-яА-я,]+$"

new_string_parts = ["005", "006"]
new_string = re.sub(re.search(my_pattern, my_string).group(1), new_string_parts[0], my_string)
new_string = re.sub(re.search(my_pattern, my_string).group(2), new_string_parts[1], new_string)
print(new_string)

I get T-4006-005-В,Г,006-В,Г instead of T-46-005-В,Г,006-В,Г because there is another “6” in my_string. How can I solve this? Thanks for your answers!


Answer

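Capture the parts you need to keep and do the replacement in a single re.sub pass. Use the unambiguous \g<N> backreference syntax in the replacement, because the backreferences are immediately followed by digits from the string variables (\1 followed by 005 would not be read as a reference to group 1):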

import re

my_string = "T-46-5-В,Г,6-В,Г"
my_pattern = r"^(\D-\d{1,2}-)[\d,]+(-[а-яёА-ЯЁ,]+,)[\d,]+(-[а-яёА-ЯЁ,]+)$"

new_string_parts = ["005", "006"]
new_string = re.sub(my_pattern, fr"\g<1>{new_string_parts[0]}\g<2>{new_string_parts[1]}\3", my_string)
print(new_string)
# => T-46-005-В,Г,006-В,Г

See the Python demo. Note I also added ёЁ to the Russian letter ranges.

The pattern ^(\D-\d{1,2}-)[\d,]+(-[а-яёА-ЯЁ,]+,)[\d,]+(-[а-яёА-ЯЁ,]+)$ now has parentheses around the parts you do not need to change: \g<1> refers to the text captured by (\D-\d{1,2}-), \g<2> to the text captured by (-[а-яёА-ЯЁ,]+,), and \3 to the text captured by (-[а-яёА-ЯЁ,]+).
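If the padded values should be computed rather than hard-coded, one possible variation is to capture the numeric parts as well and pass a function as the replacement. The sketch below is not part of the answer above: the names PATTERN, pad_numbers and width are hypothetical, and it assumes every run of digits in the two numeric parts should be left-padded to three characters.

```python
import re

# Same structure as the pattern above, but the numeric parts are captured too
# (groups 2 and 4). All names here are illustrative assumptions.
PATTERN = re.compile(
    r"^(\D-\d{1,2}-)([\d,]+)(-[а-яёА-ЯЁ,]+,)([\d,]+)(-[а-яёА-ЯЁ,]+)$"
)


def pad_numbers(value, width=3):
    """Zero-pad the two numeric parts of value; return it unchanged if it does not match."""

    def pad(part):
        # Pad each run of digits separately so comma-separated lists also work.
        return re.sub(r"\d+", lambda n: n.group().zfill(width), part)

    def rebuild(m):
        # Reassemble the string from the kept parts and the padded numeric parts.
        return m.group(1) + pad(m.group(2)) + m.group(3) + pad(m.group(4)) + m.group(5)

    return PATTERN.sub(rebuild, value)


print(pad_numbers("T-46-5-В,Г,6-В,Г"))    # T-46-005-В,Г,006-В,Г
print(pad_numbers("T-46-55-В,Г,56-В,G".replace("G", "Г")))  # T-46-055-В,Г,056-В,Г
```

Because the replacement is built inside a function from the match object, no backreference syntax is involved, so digits in the new values cannot be confused with group numbers.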
