Prepend a url to all relative image links in a markdown document

I have a bunch of markdown documents with a mix of relative and absolute image destinations. e.g.

This is some text

![optional caption](/sub/folder/image.png)

And more text

![](https://example.com/cool_image.png)

I want to prepend a URL to each of the relative images, e.g. to change the above to

This is some text

![optional caption](https://some-image-host/image-host-subpath/sub/folder/image.png)

And more text

![](https://example.com/cool_image.png)

but preferably without hard-coding /sub/folder/ into the replace script (which is how I currently do it).

Is there a clever way to do this with awk or sed or is that a bad idea due to markdown having more edge cases than one expects?

I made some progress with https://pypi.org/project/marko/, e.g.

import marko
with open("myfile.md") as f: s = f.read()

doc = marko.inline.parser.parse_inline(s)

# Prepend the host URL to every image whose destination is not already absolute
for i, e in enumerate(doc):
    if type(e) == marko.inline.Image:
        if not e.dest.startswith("http"):
            doc[i].dest = "https://some-image-host/image-host-subpath/" + doc[i].dest

This finds all the images and prepends the URL to the destination of each relative image, but I’m not sure how to render this list of inline elements back into a markdown string. I figured I would post here before re-inventing the wheel, in case there is a much simpler way of doing this.

TIA for any help.

Answer

This command will do it without altering the original file in-place:

sed 's_\(^!\[.*\](\)_\1https://some-image-host/image-host-subpath_' <input_file

Once you’ve confirmed it’s what you want, you just need to add -i after the sed and before the 's_... and also remove the < before input_file:

sed -i 's_\(^!\[.*\](\)_\1https://some-image-host/image-host-subpath_' input_file

The way the command works is as follows:

  • I’m using _ as the pattern delimiters instead of the more common /, because it means I don’t have to escape every / in the path name.
  • The pattern ^!\[.*\]( matches everything up to the point where you want to add the path. The brackets are escaped with a backslash so that they are taken literally.
  • I put the above pattern between \( and \) to remember it for later.
  • It’s added back with \1, followed by the path.
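
For readers more comfortable with Python than sed, the same capture-and-re-emit idea can be sketched with the standard re module. This is only an illustration of how the pattern works, not a replacement for the command above; the BASE constant and the prepend_base name are assumptions made for the example, with the URL taken from the question.

```python
import re

# Assumed base URL, copied from the example in the question.
BASE = "https://some-image-host/image-host-subpath"

# Same idea as the sed pattern: capture everything from "![" at the start
# of a line up to and including "](", then re-emit it followed by BASE.
PATTERN = re.compile(r"^(!\[.*\]\()", re.MULTILINE)


def prepend_base(text: str) -> str:
    """Insert BASE right after the opening "](" of each image line."""
    return PATTERN.sub(lambda m: m.group(1) + BASE, text)


sample = "This is some text\n\n![optional caption](/sub/folder/image.png)\n"
print(prepend_base(sample))
```

Like the sed version, this touches only lines that begin with ![, so an ordinary link such as [text](/page) is left alone.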

A simpler way would have been to replace the ]( part of the line with ](your_url_here:

sed 's_](_](https://some-image-host/image-host-subpath_' <input_file

but the ]( combination might also appear elsewhere in your files (ordinary links use it too), so I opted for the stricter test ^!\[.*\]( which only matches lines that begin with ![ and contain ]( later on the same line.
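
One caveat: a pattern that looks only at ![...]( will also match images whose destination is already absolute, such as the https://example.com/cool_image.png line in the question. A minimal Python sketch that skips those is shown here. It assumes inline images only, captions without a ] in them, and destinations without spaces; the names BASE, RELATIVE_IMAGE and absolutize are hypothetical, and reference-style images and fenced code blocks are not handled.

```python
import re

# Assumed base URL, copied from the example in the question.
BASE = "https://some-image-host/image-host-subpath"

# Group 1: "![caption](". The negative lookahead rejects destinations that
# start with a URL scheme ("https:", "data:", ...) or with "//".
# Group 2: the relative destination itself.
RELATIVE_IMAGE = re.compile(
    r"(!\[[^\]]*\]\()(?![A-Za-z][A-Za-z0-9+.-]*:|//)([^)\s]+)"
)


def absolutize(text: str, base: str = BASE) -> str:
    """Prefix relative image destinations with base; leave absolute ones alone."""
    base = base.rstrip("/")

    def fix(match: re.Match) -> str:
        # Join with exactly one slash whether or not the path had a leading "/".
        return match.group(1) + base + "/" + match.group(2).lstrip("/")

    return RELATIVE_IMAGE.sub(fix, text)


doc = (
    "![optional caption](/sub/folder/image.png)\n"
    "\n"
    "![](https://example.com/cool_image.png)\n"
)
print(absolutize(doc))
```

Normalising the slash in one place means the base URL can be written with or without a trailing /, and no folder name such as /sub/folder/ has to be hard-coded.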

User contributions licensed under: CC BY-SA