
Python subprocess – git log wrong JSON Format

I tried to format git log output as JSON but failed miserably.

I used this command for the formatting, and I don’t think this is where my problem lies, but hey you never know.

These are my functions.

import subprocess


def call_git_log():
    format: str = '{%n  "commit": "%H",%n  "abbreviated_commit": "%h",%n  "tree": "%T",%n  "abbreviated_tree": "%t",%n  "parent": "%P",%n  "abbreviated_parent": "%p",%n  "refs": "%D",%n  "encoding": "%e",%n  "subject": "%s",%n  "sanitized_subject_line": "%f",%n  "body": "%b",%n  "commit_notes": "%N",%n  "verification_flag": "%G?",%n  "signer": "%GS",%n  "signer_key": "%GK",%n  "author": {%n    "name": "%aN",%n    "email": "%aE",%n    "date": "%aD"%n  },%n  "commiter": {%n    "name": "%cN",%n    "email": "%cE",%n    "date": "%cD"%n  }%n},'
    output = subprocess.Popen(["git", "log", f"--pretty=format:{format}"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = output.communicate()
    return stdout.decode("utf-8")

output = call_git_log()
with open("output/test.json", "w") as file:
    print(str(output), file=file)

As a result I get this file, output/test.json, which is not valid JSON. Why is that, and what is wrong?

{
  "commit": "4099117e564e7106b7ee7e315e3e8b8458a8fdce",
  "abbreviated_commit": "4099117",
  "tree": "6b1eb2fbf81de876d14781ffa82b5ee5db973af6",
  "abbreviated_tree": "6b1eb2f",
  "parent": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
  "abbreviated_parent": "37445d7",
  "refs": "HEAD -> master, master/master",
  "encoding": "",
  "subject": "ue04-plots - A.3 fertig",
  "sanitized_subject_line": "ue04-plots-A.3-fertig",
  "body": "",
  "commit_notes": "",
  "verification_flag": "N",
  "signer": "",
  "signer_key": "",
  "author": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:50:33 +0100"
  },
  "commiter": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:50:33 +0100"
  }
},
{
  "commit": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
  "abbreviated_commit": "37445d7",
  "tree": "caa7df1bd70b5fd2319e903331c2a96d80f08152",
  "abbreviated_tree": "caa7df1",
  "parent": "cb484ec66468c5bbac1f78a8ed87852202207701",
  "abbreviated_parent": "cb484ec",
  "refs": "",
  "encoding": "",
  "subject": "ue04-plots - arrows",
  "sanitized_subject_line": "ue04-plots-arrows",
  "body": "",
  "commit_notes": "",
  "verification_flag": "N",
  "signer": "",
  "signer_key": "",
  "author": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:48:45 +0100"
  },
  "commiter": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:48:45 +0100"
  }
},
{
  "commit": "cb484ec66468c5bbac1f78a8ed87852202207701",
  "abbreviated_commit": "cb484ec",
  "tree": "73e2e71396290d9627b9301451ca5a1bb7ba6df4",
  "abbreviated_tree": "73e2e71",
  "parent": "becd22ff715defbe00e064181ee71266e3d1db45",
  "abbreviated_parent": "becd22f",
  "refs": "",
  "encoding": "",
  "subject": "ue04-plots - titel",
  "sanitized_subject_line": "ue04-plots-titel",
  "body": "",
  "commit_notes": "",
  "verification_flag": "N",
  "signer": "",
  "signer_key": "",
  "author": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:33:59 +0100"
  },
  "commiter": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:33:59 +0100"
  }
},

What do I have to change to make this a valid JSON document that json.loads() can process?


Answer

It looks like you are turning the output of git log into a JSON file, passing it to a JSON parser, and getting an error there?

Yes, your output is not valid JSON: a sequence of objects has to be a JSON array, so it must be wrapped in square brackets, [ at the beginning and ] at the end, with no trailing comma after the last object.

See https://stackoverflow.com/a/4600561/9035237 and https://gist.github.com/textarcana/1306223 for a post-processing example. The code in the link you mentioned does this as well.

If you are using Python, you may:

output = "[" + output + "]"
output = output.replace("},]", "}]")
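
A slightly more defensive variant of the two lines above, as a minimal sketch: it strips trailing whitespace before removing the last comma, so it also works on text read back from the file (print() appends a newline there). The helper name wrap_as_array and the sample string are made up for illustration.

```python
import json


def wrap_as_array(raw: str) -> str:
    """Turn comma-terminated JSON objects into the text of a JSON array."""
    body = raw.strip()
    # Drop the comma that follows the last object, if there is one.
    if body.endswith(","):
        body = body[:-1]
    return "[" + body + "]"


# Shortened stand-in for the git log output shown in the question.
sample = '{"commit": "4099117"},\n{"commit": "37445d7"},\n'
commits = json.loads(wrap_as_array(sample))
print(len(commits))
```

This only fixes the missing brackets and the trailing comma; it does nothing about quotes or line breaks inside the fields.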

However, your format still has problems: JSON does not accept a raw line break inside a string, and an unescaped " in any field breaks the document. Both are likely to occur in a commit message, so your format needs to change.
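
To see both failure modes, here is a small sketch. The two strings are invented examples of what the format would produce for a multi-line body and for a subject containing a quote; is_valid_json is a hypothetical helper.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if json.loads() accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A real line break inside the string value (as %b would emit it).
raw_newline = '{"body": "line one\nline two"}'
# An unescaped double quote inside the string value.
raw_quote = '{"subject": "fix "quoted" word"}'

print(is_valid_json(raw_newline))
print(is_valid_json(raw_quote))

# json.dumps() shows what the escaped form has to look like.
escaped = json.dumps({"body": "line one\nline two", "subject": 'fix "quoted" word'})
print(escaped)
```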

As https://gist.github.com/varemenos/e95c2e098e657c7688fd?permalink_comment_id=3260906#gistcomment-3260906 suggests, you can use a hack: pick a string that will probably not occur in any field, for example ^^^^, as a temporary placeholder for the quote character. Then escape the special characters, for example turn a line break into \n and " into \", and finally replace ^^^^ with ". Don't pretty-print the JSON at this step; leave that to a JSON formatter.
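
The placeholder hack could look roughly like this in Python. This is a sketch under assumptions, not the code from the linked gist: the flat three-field GIT_FORMAT, the name placeholder_log_to_json and the sample text are all made up here, and it assumes ^^^^ never appears in a commit field and no field ends with ^. Only backslash, quote, carriage return, tab and newline are escaped; other control characters would need the same treatment.

```python
import json

PLACEHOLDER = "^^^^"  # assumption: never occurs in any commit field

# Hypothetical flat format without %n; it would be passed to git as
# f"--pretty=format:{GIT_FORMAT}".
GIT_FORMAT = (
    "{^^^^commit^^^^: ^^^^%H^^^^, "
    "^^^^subject^^^^: ^^^^%s^^^^, "
    "^^^^body^^^^: ^^^^%b^^^^},"
)


def placeholder_log_to_json(raw: str) -> list:
    """Convert placeholder-quoted git log text into a list of dicts."""
    text = raw.strip()
    # 1. Escape what a JSON string cannot contain literally (backslash first).
    text = text.replace("\\", "\\\\").replace('"', '\\"')
    text = text.replace("\r", "\\r").replace("\t", "\\t").replace("\n", "\\n")
    # 2. The newline git puts between two records was escaped too; remove it.
    boundary = PLACEHOLDER + "},\\n{" + PLACEHOLDER
    text = text.replace(boundary, PLACEHOLDER + "},{" + PLACEHOLDER)
    # 3. Only now turn the placeholders into real quotes.
    text = text.replace(PLACEHOLDER, '"')
    # 4. Drop the trailing comma and wrap everything in an array.
    if text.endswith(","):
        text = text[:-1]
    return json.loads("[" + text + "]")


# Invented sample of what git would print with GIT_FORMAT.
sample = (
    '{^^^^commit^^^^: ^^^^abc^^^^, ^^^^subject^^^^: ^^^^say "hi"^^^^, '
    "^^^^body^^^^: ^^^^line one\nline two^^^^},\n"
    "{^^^^commit^^^^: ^^^^def^^^^, ^^^^subject^^^^: ^^^^path C:\\tmp^^^^, "
    "^^^^body^^^^: ^^^^^^^^},"
)
commits = placeholder_log_to_json(sample)
print(json.dumps(commits, indent=2))
```

The result is a normal Python list, so json.dumps(..., indent=2) can handle the pretty-printing afterwards.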

User contributions licensed under: CC BY-SA