I tried to format the output of git log as JSON but failed miserably.
I used this command for the formatting, and I don’t think this is where my problem lies, but hey you never know.
These are my functions.
import subprocess

def call_git_log():
    # One JSON-like object per commit; %n is git's newline placeholder.
    format: str = (
        '{%n "commit": "%H",%n "abbreviated_commit": "%h",'
        '%n "tree": "%T",%n "abbreviated_tree": "%t",'
        '%n "parent": "%P",%n "abbreviated_parent": "%p",'
        '%n "refs": "%D",%n "encoding": "%e",'
        '%n "subject": "%s",%n "sanitized_subject_line": "%f",'
        '%n "body": "%b",%n "commit_notes": "%N",'
        '%n "verification_flag": "%G?",%n "signer": "%GS",%n "signer_key": "%GK",'
        '%n "author": {%n "name": "%aN",%n "email": "%aE",%n "date": "%aD"%n },'
        '%n "commiter": {%n "name": "%cN",%n "email": "%cE",%n "date": "%cD"%n }%n},'
    )
    output = subprocess.Popen(["git", "log", f"--pretty=format:{format}"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = output.communicate()
    return stdout.decode("utf-8")
output = call_git_log()
with open("output/test.json", "w") as file:
    print(str(output), file=file)
As a result I get this file, which is not valid JSON. Why is that, and what is wrong?
output/test.json
{ "commit": "4099117e564e7106b7ee7e315e3e8b8458a8fdce", "abbreviated_commit": "4099117", "tree": "6b1eb2fbf81de876d14781ffa82b5ee5db973af6", "abbreviated_tree": "6b1eb2f", "parent": "37445d7254f726801ce5ed067b8ee8eb523b8b99", "abbreviated_parent": "37445d7", "refs": "HEAD -> master, master/master", "encoding": "", "subject": "ue04-plots - A.3 fertig", "sanitized_subject_line": "ue04-plots-A.3-fertig", "body": "", "commit_notes": "", "verification_flag": "N", "signer": "", "signer_key": "", "author": { "name": "Lorenz Bauer", "email": "", "date": "Wed, 14 Dec 2022 00:50:33 +0100" }, "commiter": { "name": "Lorenz Bauer", "email": "", "date": "Wed, 14 Dec 2022 00:50:33 +0100" } }, { "commit": "37445d7254f726801ce5ed067b8ee8eb523b8b99", "abbreviated_commit": "37445d7", "tree": "caa7df1bd70b5fd2319e903331c2a96d80f08152", "abbreviated_tree": "caa7df1", "parent": "cb484ec66468c5bbac1f78a8ed87852202207701", "abbreviated_parent": "cb484ec", "refs": "", "encoding": "", "subject": "ue04-plots - arrows", "sanitized_subject_line": "ue04-plots-arrows", "body": "", "commit_notes": "", "verification_flag": "N", "signer": "", "signer_key": "", "author": { "name": "Lorenz Bauer", "email": "", "date": "Wed, 14 Dec 2022 00:48:45 +0100" }, "commiter": { "name": "Lorenz Bauer", "email": "", "date": "Wed, 14 Dec 2022 00:48:45 +0100" } }, { "commit": "cb484ec66468c5bbac1f78a8ed87852202207701", "abbreviated_commit": "cb484ec", "tree": "73e2e71396290d9627b9301451ca5a1bb7ba6df4", "abbreviated_tree": "73e2e71", "parent": "becd22ff715defbe00e064181ee71266e3d1db45", "abbreviated_parent": "becd22f", "refs": "", "encoding": "", "subject": "ue04-plots - titel", "sanitized_subject_line": "ue04-plots-titel", "body": "", "commit_notes": "", "verification_flag": "N", "signer": "", "signer_key": "", "author": { "name": "Lorenz Bauer", "email": "", "date": "Wed, 14 Dec 2022 00:33:59 +0100" }, "commiter": { "name": "Lorenz Bauer", "email": "", "date": "Wed, 14 Dec 2022 00:33:59 +0100" } },
What do I have to change to make this a valid JSON document that json.loads() can process?
Answer
It looks like you are turning the output of git log into a JSON file, passing it on to a JSON parser, and getting an error there?
Yes, your output is not valid JSON. To be an array, it needs an opening bracket at the beginning and a closing bracket at the end.
See https://stackoverflow.com/a/4600561/9035237 → https://gist.github.com/textarcana/1306223 for a post-processing example. The code in the link you mentioned does this too.
If you are using Python, you can do this:
output = "[" + output + "]"
output = output.replace("},]", "}]")
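To make the bracket fix concrete, here is a minimal sketch. The helper name to_json_array and the two-commit sample string are made up for illustration; the sample only mimics the shape of your output (objects separated by commas, with one trailing comma).

```python
import json

def to_json_array(raw: str) -> str:
    """Wrap comma-separated JSON objects in brackets, dropping a trailing comma."""
    body = raw.strip()
    if body.endswith(","):
        body = body[:-1]  # a comma after the last element is not allowed in JSON
    return "[" + body + "]"

# Hypothetical, shortened stand-in for what call_git_log() returns.
sample = '{\n "commit": "4099117"\n},\n{\n "commit": "37445d7"\n},'

commits = json.loads(to_json_array(sample))
print([c["commit"] for c in commits])
```

Stripping the trailing comma before adding the brackets avoids depending on the exact "},]" sequence, which would not match if there were whitespace between the comma and the closing bracket.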
However, your format still has problems: JSON does not accept a raw line break inside a string, and a " in any field breaks the document. Both are likely to occur in a commit message, so your format should change.
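You can see both failure modes with a quick check. This is only a sketch; the two sample strings are invented and stand for a commit subject containing a quote and a commit body containing a line break.

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if json.loads() accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

# An unescaped double quote ends the string early.
broken_quote = '{"subject": "fix "quoted" word"}'
# A raw line break (a control character) inside a string is rejected.
broken_newline = '{"body": "line one\nline two"}'
# The same body with the line break escaped as \n is fine.
escaped_newline = '{"body": "line one\\nline two"}'

for text in (broken_quote, broken_newline, escaped_newline):
    print(is_valid_json(text))
```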
As https://gist.github.com/varemenos/e95c2e098e657c7688fd?permalink_comment_id=3260906#gistcomment-3260906 suggests, you can use a hack: pick a string that will probably not occur in any field, for example ^^^^, and use it as a temporary quote placeholder. Then escape the special characters, for example a line break → \n and " → \", and finally replace ^^^^ → ". Don't prettify the JSON at this step; leave that to a JSON formatter.
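Here is one way the placeholder idea could look in Python. It is a sketch under assumptions, not the code from the gist: it assumes your --pretty=format string uses ^^^^ wherever it used " before, that ^^^^ never occurs inside a field, and that no field ends with ^. Instead of a chain of replace() calls, it splits on the placeholder and lets json.dumps() do the escaping of each field, which also covers backslashes and other control characters. The function name and the sample string are made up.

```python
import json

PLACEHOLDER = "^^^^"  # assumption: never appears inside any commit field

def placeholders_to_json(raw: str) -> str:
    """Turn placeholder-quoted git output into a JSON array string."""
    parts = raw.split(PLACEHOLDER)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2:
            # Odd pieces sit between two placeholders: field contents.
            # json.dumps() adds the quotes and escapes ", \ and line breaks.
            pieces.append(json.dumps(part))
        else:
            # Even pieces are the JSON structure (braces, colons, commas).
            pieces.append(part)
    body = "".join(pieces).strip()
    if body.endswith(","):
        body = body[:-1]  # drop the comma after the last commit
    return "[" + body + "]"

# Hypothetical output for two commits, one with a quote and a line break.
raw = (
    '{^^^^subject^^^^: ^^^^say "hi"^^^^, ^^^^body^^^^: ^^^^line one\nline two^^^^},\n'
    '{^^^^subject^^^^: ^^^^second^^^^, ^^^^body^^^^: ^^^^^^^^},'
)

commits = json.loads(placeholders_to_json(raw))
print(commits[0]["subject"])
```

If you want a pretty file afterwards, json.dumps(commits, indent=2) can produce it from the parsed result, so the git format itself does not need any %n for layout.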