
Regular expression: match the closing quote to the opening quote

I want to capture the value of an HTML attribute with a Python regexp. Currently I use:

JavaScript

My problem is that I want the regular expression to “remember” whether the attribute started with a single or a double quote.

I found the bug in my current approach with the following attribute:

JavaScript

My regex captures:

JavaScript


Answer

You can capture the first quote and then use a backreference:

JavaScript
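As a minimal sketch of the backreference idea in Python: group 1 captures whichever quote opens the value, and `\1` requires that same character to close it, so the other kind of quote may appear inside the value. The function name `attribute_value` and the sample tag are made up for illustration and are not taken from the question.

```python
import re

def attribute_value(tag_text, name):
    """Return the value of attribute `name`, or None if it is not found."""
    pattern = re.compile(
        # (["']) remembers the opening quote; \1 must match that same quote.
        r"\b" + re.escape(name) + r"""\s*=\s*(["'])(.*?)\1""",
        re.DOTALL,
    )
    match = pattern.search(tag_text)
    return match.group(2) if match else None

# Hypothetical input: each value contains the *other* kind of quote.
sample = """<a onclick="alert('hi')" title='say "hi"'>"""
print(attribute_value(sample, "onclick"))
print(attribute_value(sample, "title"))
```

The lazy `.*?` stops at the first occurrence of the remembered quote, so this sketch does not handle unquoted attribute values or entity-encoded quotes.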

However, regular expressions are not the proper approach to parsing HTML. You should consider using a DOM parser instead.
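For comparison, here is a rough sketch of the parser-based route using Python's standard-library `html.parser` module. It is an event-based parser rather than a full DOM, chosen here only because it needs no third-party install. The class name `AttributeCollector` and the helper `collect_attribute` are hypothetical names for this example.

```python
from html.parser import HTMLParser

class AttributeCollector(HTMLParser):
    """Collect every value of one attribute name across all start tags."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes already removed.
        for name, value in attrs:
            if name == self.wanted:
                self.values.append(value)

def collect_attribute(html, name):
    collector = AttributeCollector(name)
    collector.feed(html)
    collector.close()
    return collector.values

# Hypothetical input mixing both quote styles.
sample_html = """<a onclick="alert('hi')" title='say "hi"'>link</a>"""
print(collect_attribute(sample_html, "title"))
```

The parser works out which quote closes each value, so no quote bookkeeping is needed in your own code.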
