
Python – how to overwrite output from Markdown library

I have the following problem: I'm using the markdown library for my web app and I need to modify the output it generates. Specifically, I want to change the default <img src="..."> tag into <img data-src="...">. What would be the best way to change the HTML generated by this module?


Answer

You probably want to use Python-Markdown’s Extension API. Most people use the API to add their own syntax, but it can alter the existing output just as easily. That way, you can use Markdown’s parser but get your desired output. No need for wrappers or parsing twice.

In your case, you want to subclass the ImagePattern class and return your own Element from its handleMatch method. Then you just need to tell Markdown about it. The regex doesn't even need to be different: import and reuse the existing IMAGE_LINK_RE and override inlinePatterns["image_link"].
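A minimal sketch of that approach follows. It assumes Python-Markdown 3.x, where (as far as I know) the class is called ImageInlineProcessor rather than ImagePattern and handleMatch returns an (element, start, end) tuple. The names DataSrcImageProcessor and DataSrcInlineExtension are made up for this example, and the priority of 150 is my understanding of the built-in value, so check it against your installed version.

```python
import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import IMAGE_LINK_RE, ImageInlineProcessor


class DataSrcImageProcessor(ImageInlineProcessor):
    """Hypothetical subclass: build the normal <img>, then rename src."""

    def handleMatch(self, m, data):
        # Let the stock processor parse the link and build the element.
        el, start, end = super().handleMatch(m, data)
        if el is not None:
            src = el.attrib.pop("src", None)
            if src is not None:
                el.set("data-src", src)
        return el, start, end


class DataSrcInlineExtension(Extension):
    """Hypothetical extension that swaps in the processor above."""

    def extendMarkdown(self, md):
        # Registering under the existing name "image_link" replaces the
        # built-in processor. 150 is assumed to match the built-in priority.
        md.inlinePatterns.register(
            DataSrcImageProcessor(IMAGE_LINK_RE, md), "image_link", 150
        )


html = markdown.markdown(
    "![a cat](cat.png)", extensions=[DataSrcInlineExtension()]
)
print(html)
```

On older 2.x releases the registration call and the handleMatch signature are different, so treat this as a starting point rather than a drop-in.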

This tutorial should get you started. While it is implementing a different syntax, the basics are the same and it is a lot shorter than the API docs. See also part 1.

For completeness, if you use the reference syntax for your images, you would need to do the same thing with the ImageReferencePattern. You may find it easier to implement this as a Treeprocessor instead (which I believe is what @Kos was referring to in his comment to the original post). That way the existing parser builds the existing output, but before the ElementTree is serialized to text, you can loop through all the img tags and alter them to fit your needs. As an example, the HeaderId Extension does this to add IDs to h1-6 tags.
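Here is a rough sketch of the Treeprocessor route, again assuming Python-Markdown 3.x and its register() API. DataSrcTreeprocessor, DataSrcExtension, and the name "data_src" are hypothetical. The priority of 15 is an assumption chosen so that this runs after the built-in inline treeprocessor, which I believe is registered at 20; the img elements only exist once that has run.

```python
import markdown
from markdown.extensions import Extension
from markdown.treeprocessors import Treeprocessor


class DataSrcTreeprocessor(Treeprocessor):
    """Hypothetical treeprocessor: rename src to data-src on every <img>."""

    def run(self, root):
        for img in root.iter("img"):
            src = img.attrib.pop("src", None)
            if src is not None:
                img.set("data-src", src)
        # Returning None tells Markdown the tree was modified in place.


class DataSrcExtension(Extension):
    """Hypothetical extension that registers the treeprocessor above."""

    def extendMarkdown(self, md):
        # Assumed priority: below the inline treeprocessor so that the
        # <img> elements have already been created when run() is called.
        md.treeprocessors.register(DataSrcTreeprocessor(md), "data_src", 15)


text = "![inline](a.png)\n\n![by reference][pic]\n\n[pic]: b.png"
html = markdown.markdown(text, extensions=[DataSrcExtension()])
print(html)
```

This handles both inline and reference-style images in one place. Note that a raw <img> tag typed directly into the Markdown source is passed through as raw HTML and never appears in the tree, so it would not be touched.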

User contributions licensed under: CC BY-SA