Validate (X)HTML in Python

What’s the best way to validate that a document follows some version of HTML (preferably one that I can specify)? I’d like to know where the failures occur, as with a web-based validator, but in a native Python app.

Answer

XHTML is easy: use lxml.
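
A minimal sketch of what that could look like, assuming lxml is installed. The function name `validate_xhtml` and the optional `dtd_path` argument are illustrative, not part of lxml. The function parses strictly to catch well-formedness errors and, if you point it at a local DTD file for the XHTML version you care about, validates against that too. Each failure is reported with its line and column.

```python
from lxml import etree


def validate_xhtml(markup, dtd_path=None):
    """Return a list of (line, column, message) tuples; empty means no errors.

    markup   -- the document as bytes
    dtd_path -- optional path to a local DTD file (hypothetical parameter,
                e.g. a downloaded copy of the XHTML 1.0 Strict DTD)
    """
    # recover=False makes the parser stop on malformed input instead of guessing.
    parser = etree.XMLParser(recover=False)
    try:
        root = etree.fromstring(markup, parser)
    except etree.XMLSyntaxError:
        # The parser keeps a log of what went wrong, with positions.
        return [(e.line, e.column, e.message) for e in parser.error_log]

    if dtd_path is None:
        return []

    # Validate the parsed tree against the chosen DTD.
    dtd = etree.DTD(dtd_path)
    dtd.validate(root)
    return [(e.line, e.column, e.message) for e in dtd.error_log]


if __name__ == "__main__":
    broken = b"<html><body><p>oops</body></html>"
    for line, column, message in validate_xhtml(broken):
        print(f"{line}:{column}: {message}")
```

Without a DTD this only checks that the document is well-formed XML, which is a weaker guarantee than validity against a specific XHTML version.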

HTML is harder, since there has traditionally been less interest in validation among the HTML crowd (run StackOverflow itself through a validator, yikes). The easiest solution is to run an external application such as nsgmls or OpenJade and then parse its output.
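
A rough sketch of the run-and-parse approach, under a few assumptions: that nsgmls is on your PATH, that `-s` suppresses its normal output so only messages remain, and that each message is written to stderr in the form `program:file:line:column:type: text`. Check these against the tool you actually have installed. The names `parse_messages` and `run_validator` are illustrative.

```python
import re
import subprocess

# Assumed message shape: "nsgmls:page.html:12:5:E: some message".
# Adjust the pattern if your validator formats its output differently.
_MESSAGE = re.compile(
    r"^[^:]+:(?P<file>.+?):(?P<line>\d+):(?P<col>\d+):(?P<kind>[A-Z]):\s*(?P<text>.*)$"
)


def parse_messages(output):
    """Turn validator output into (file, line, column, kind, text) tuples."""
    messages = []
    for raw in output.splitlines():
        match = _MESSAGE.match(raw)
        if match is None:
            continue  # skip lines that are not in the assumed format
        messages.append(
            (
                match.group("file"),
                int(match.group("line")),
                int(match.group("col")),
                match.group("kind"),
                match.group("text"),
            )
        )
    return messages


def run_validator(path, command=("nsgmls", "-s")):
    """Run the external validator on `path` and return its parsed messages.

    The default command is an assumption; pass your own if it differs.
    """
    result = subprocess.run([*command, path], capture_output=True, text=True)
    return parse_messages(result.stderr)
```

Keeping the parsing separate from the subprocess call makes it easy to swap in a different validator: only the command and the regular expression need to change.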