Why isn’t my re.sub finding all instances using my regex?

I’m using Python 3.10 on Windows 10 and trying the search below:

re.sub(r'(.*[A-Z]+[a-z]+)([A-Z])', r'\1 \2', 'JohnnyB Cool & JoeCool')
'JohnnyB Cool & Joe Cool'

If I use just “JohnnyB Cool”, the “B” gets a space before it.

re.sub(r'(.*[A-Z]+[a-z]+)([A-Z])', r'\1 \2', 'JohnnyB Cool')
'Johnny B Cool'

Why isn’t the “JohnnyB” substituted in the first search? I’ve also tried:

re.sub(r'(.*)([A-Z]+[a-z]+)([A-Z])', r'\1 \2 \3', 'JohnnyB Cool & JoeCool')
'JohnnyB Cool &  Joe Cool'

To be clear, I want the final result to be: Johnny B Cool & Joe Cool.

Answer

You can use this Python code:

>>> import re
>>> s = 'JohnnyB Cool & JoeCool'
>>> print(re.sub(r'\B[A-Z]', r' \g<0>', s))
Johnny B Cool & Joe Cool

Explanation:

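  • \B matches wherever \b doesn’t, i.e. at a position that is not a word boundary; here that means the capital is directly preceded by another word character
  • [A-Z] matches an uppercase letter
  • \g<0> in the replacement stands for the whole match, so ' \g<0>' puts a space in front of the matched capital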
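
To see why the pattern in the question only made one substitution, it can help to inspect what each pattern actually matches. The following is a minimal sketch, not part of the original answer; the helper name split_camel is hypothetical and chosen only for illustration.

```python
import re

s = 'JohnnyB Cool & JoeCool'

# The greedy .* in the question's pattern first runs to the end of the string,
# then backtracks only as far as the LAST lowercase-then-uppercase spot.
# That single match already covers "JohnnyB", so re.sub has nothing left to
# match before it.
m = re.search(r'(.*[A-Z]+[a-z]+)([A-Z])', s)
print(m.group(1))  # JohnnyB Cool & Joe
print(m.group(2))  # C

# \B[A-Z] consumes only the capital letter itself, so each occurrence is a
# separate, non-overlapping match.
hits = [(hit.start(), hit.group()) for hit in re.finditer(r'\B[A-Z]', s)]
print(hits)  # [(6, 'B'), (18, 'C')]

def split_camel(text):
    """Insert a space before every capital that follows a word character."""
    return re.sub(r'\B[A-Z]', r' \g<0>', text)

print(split_camel(s))  # Johnny B Cool & Joe Cool
```

One caveat with this approach: a run of capitals is split letter by letter as well, so split_camel('ABC') returns 'A B C'. If that matters for your data, the pattern would need to be tightened.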
User contributions licensed under: CC BY-SA