Python find all occurrences of hyphenated word and replace at position

Question

I have to replace all occurrences of patterns with hyphen like c-c-c-c-come or oh-oh-oh-oh, etc. with the last token i.e. come or oh in this example, where The number of character between hyphen is arbitrary, it can be one ore more characters the token to match is the last token in the hyphenation, hence come…

Accepted Answer

An option using a capturing group and a backreference might be:(?<!S)(w{2,3})(?:-1)*-(w+)(?!S)That will match:(?<!S) Negative lookbehind, assert what is on the left is not a non whitespace char(w{2,3}) Capture in group 1 two or three times a word char(?:-1)* Repeat 0+ times matching a hyphen followed by a backreference to what is matched in group 1-(w+) Match - followed by matching 1+ word chars in group 2(?!S) Negative lookahead, assert what is on the right is not a non whitespace charIn the replacement use the second capturing group \2 or r'2Regex demo | Python demoFor exampleimport retext = "c-c-c-c-come oh-oh-oh-oh it's a bad life oh-oh-oh-oh"pattern = r"(?<!S)(w{1,3})(?:-1)*-(w+)(?!S)"text = re.sub(pattern, r'2', text)print(text)Resultcome oh it's a bad life oh

Advertisement

Answer