
How to remove sub-string starting and ending with something?

How can I remove every sub-string that starts and ends with a certain character combination? For example, given:

' bla <span class=""latex""> ... This can be different1 ... </span> blub <span class=""latex""> ... This can be different2 ... </span> bleb'

I want this as the result:

'bla blub bleb'

I tried something like this:

string.replace('<span class=""latex"">' * '</span>', '')

but this does not work.

Is there a way to implement this?


Answer

This could work:

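>>> import re
>>> s = ' bla <span class=""latex""> ... This can be different1 ... </span> blub <span class=""latex""> ... This can be different2 ... </span> bleb'
>>> x = re.sub(r"""<span class=""latex"">.+?</span>""", "", s)
>>> x
' bla  blub  bleb'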
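Note that the result above still has a leading space and doubled spaces, while the question asks for 'bla blub bleb'. A minimal sketch of one way to close that gap, assuming it is acceptable to collapse every run of whitespace into a single space; `re.DOTALL` and `.*?` are additions here so that spans containing newlines or no text are removed as well:

```python
import re

s = (' bla <span class=""latex""> ... This can be different1 ... </span>'
     ' blub <span class=""latex""> ... This can be different2 ... </span> bleb')

# Remove each span lazily. re.DOTALL lets "." match newlines too,
# and ".*?" (instead of ".+?") also matches an empty span.
stripped = re.sub(r'<span class=""latex"">.*?</span>', '', s, flags=re.DOTALL)

# Collapse the whitespace runs left behind and trim both ends.
result = ' '.join(stripped.split())
print(result)  # bla blub bleb
```

If the original spacing outside the spans matters, skip the last step and keep `stripped` as is.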


EDIT: after clarification by the OP, I changed the answer to use a lazy quantifier instead of a capturing group. While this works, it does not scale to more complex cases such as nested tags. In those cases, the proper solution is to parse the string and extract what is needed.
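The parsing route mentioned in the edit could look roughly like the sketch below, which uses the standard library's `html.parser`. The names `LatexSpanStripper` and `strip_latex_spans` are hypothetical helpers, not part of any library. It also assumes the doubled quotes (`""latex""`) can first be normalized to ordinary quotes, since the parser expects regular HTML attribute quoting.

```python
from html.parser import HTMLParser


class LatexSpanStripper(HTMLParser):
    """Hypothetical helper: keep only text outside <span class="latex">."""

    def __init__(self):
        super().__init__()
        self.parts = []      # text fragments to keep
        self.skip_depth = 0  # > 0 while inside a latex span

    def handle_starttag(self, tag, attrs):
        if tag != 'span':
            return
        # Start skipping at a latex span; count nested spans while skipping.
        if self.skip_depth or dict(attrs).get('class') == 'latex':
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == 'span' and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def strip_latex_spans(text):
    parser = LatexSpanStripper()
    # Assumption: doubled quotes are an escaping artifact and can be undone.
    parser.feed(text.replace('""', '"'))
    parser.close()
    # Collapse leftover whitespace runs into single spaces.
    return ' '.join(''.join(parser.parts).split())


s = (' bla <span class=""latex""> ... This can be different1 ... </span>'
     ' blub <span class=""latex""> ... This can be different2 ... </span> bleb')
print(strip_latex_spans(s))  # bla blub bleb
```

Unlike the lazy regex, the depth counter keeps skipping until the matching closing tag, so a span nested inside a latex span does not end the removal early.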

User contributions licensed under: CC BY-SA