Python: Convert markdown table to json with

I am trying to figure out the easiest way to convert markdown table text into JSON using only Python. For example, consider this input string:

| Some Title | Some Description             | Some Number |
|------------|------------------------------|-------------|
| Dark Souls | This is a fun game           | 5           |
| Bloodborne | This one is even better      | 2           |
| Sekiro     | This one is also pretty good | 110101      |

The desired output is:

[
    {"Some Title":"Dark Souls","Some Description":"This is a fun game","Some Number":5},
    {"Some Title":"Bloodborne","Some Description":"This one is even better","Some Number":2},
    {"Some Title":"Sekiro","Some Description":"This one is also pretty good","Some Number":110101}
]

Note: Ideally, the output should be RFC 8259 compliant, i.e. use double quotes " instead of single quotes ' around keys and string values.

I’ve seen some JS libraries that do this, but nothing for Python. Can someone explain the quickest way to achieve this, so I don’t have to write my own parser?

All help is appreciated!

Answer

You can treat the table as a multi-line string and parse it line by line, splitting on \n for rows and on | for cells.

Simple code that does that:

import json

my_str='''| Some Title | Some Description             | Some Number |
|------------|------------------------------|-------------|
| Dark Souls | This is a fun game           | 5           |
| Bloodborne | This one is even better      | 2           |
| Sekiro     | This one is also pretty good | 110101      |'''

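def mrkd2json(inp):
    lines = inp.split('\n')
    ret = []
    keys = []
    for i, l in enumerate(lines):
        if i == 0:
            # Header row: the leading and trailing '|' leave empty first and last entries
            keys = [_i.strip() for _i in l.split('|')]
        elif i == 1:
            # Skip the |---|---| separator row
            continue
        else:
            # Data row: ignore the empty first and last entries, pair each cell with its header
            ret.append({keys[_i]: v.strip() for _i, v in enumerate(l.split('|')) if 0 < _i < len(keys) - 1})
    return json.dumps(ret, indent=4)

print(mrkd2json(my_str))

Output:
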
[
    {
        "Some Title": "Dark Souls",
        "Some Description": "This is a fun game",
        "Some Number": "5"
    },
    {
        "Some Title": "Bloodborne",
        "Some Description": "This one is even better",
        "Some Number": "2"
    },
    {
        "Some Title": "Sekiro",
        "Some Description": "This one is also pretty good",
        "Some Number": "110101"
    }
]
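
Note that the output above keeps every cell as a string ("5"), while the question asks for numbers (5). Here is a minimal sketch of one way to handle that. It is not part of the original answer, and the names convert_cell and markdown_table_to_json are made up for illustration. It assumes that only whole-number cells should become integers, that every row starts and ends with |, that no cell contains an escaped pipe, and that the second line is always the separator row.

```python
import json
import re

# Assumption: only cells made up entirely of digits (optionally signed) count as numbers.
INTEGER = re.compile(r'-?\d+')


def convert_cell(text):
    """Return an int for whole-number cells, otherwise the stripped string."""
    text = text.strip()
    return int(text) if INTEGER.fullmatch(text) else text


def markdown_table_to_json(table):
    # Drop blank lines, strip the outer pipes, then split on the inner ones.
    rows = [
        line.strip().strip('|').split('|')
        for line in table.splitlines()
        if line.strip()
    ]
    header = [cell.strip() for cell in rows[0]]
    body = rows[2:]  # rows[1] is the |---|---| separator row
    records = [
        dict(zip(header, (convert_cell(cell) for cell in row)))
        for row in body
    ]
    # json.dumps always emits double-quoted keys and strings.
    return json.dumps(records, indent=4)
```

Calling print(markdown_table_to_json(my_str)) with the string from the answer should then give "Some Number": 5 rather than "Some Number": "5". Floats, booleans and empty cells are left as strings here; extend convert_cell if you need them.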

PS: I don’t know of any library that does this. I will update the answer if I find one!