I tried to format git log output as JSON but failed miserably.
I used this command for the formatting, and I don't think this is where my problem lies, but you never know.
This is my code:
import subprocess


def call_git_log():
    format: str = '{%n "commit": "%H",%n "abbreviated_commit": "%h",%n "tree": "%T",%n "abbreviated_tree": "%t",%n "parent": "%P",%n "abbreviated_parent": "%p",%n "refs": "%D",%n "encoding": "%e",%n "subject": "%s",%n "sanitized_subject_line": "%f",%n "body": "%b",%n "commit_notes": "%N",%n "verification_flag": "%G?",%n "signer": "%GS",%n "signer_key": "%GK",%n "author": {%n "name": "%aN",%n "email": "%aE",%n "date": "%aD"%n },%n "commiter": {%n "name": "%cN",%n "email": "%cE",%n "date": "%cD"%n }%n},'
    output = subprocess.Popen(["git", "log", f"--pretty=format:{format}"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = output.communicate()
    return stdout.decode("utf-8")


output = call_git_log()
with open("output/test.json", "w") as file:
    print(str(output), file=file)
As a result I get the file below, which is not valid JSON. Why is that, and what is wrong?
output/test.json
{
"commit": "4099117e564e7106b7ee7e315e3e8b8458a8fdce",
"abbreviated_commit": "4099117",
"tree": "6b1eb2fbf81de876d14781ffa82b5ee5db973af6",
"abbreviated_tree": "6b1eb2f",
"parent": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
"abbreviated_parent": "37445d7",
"refs": "HEAD -> master, master/master",
"encoding": "",
"subject": "ue04-plots - A.3 fertig",
"sanitized_subject_line": "ue04-plots-A.3-fertig",
"body": "",
"commit_notes": "",
"verification_flag": "N",
"signer": "",
"signer_key": "",
"author": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:50:33 +0100"
},
"commiter": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:50:33 +0100"
}
},
{
"commit": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
"abbreviated_commit": "37445d7",
"tree": "caa7df1bd70b5fd2319e903331c2a96d80f08152",
"abbreviated_tree": "caa7df1",
"parent": "cb484ec66468c5bbac1f78a8ed87852202207701",
"abbreviated_parent": "cb484ec",
"refs": "",
"encoding": "",
"subject": "ue04-plots - arrows",
"sanitized_subject_line": "ue04-plots-arrows",
"body": "",
"commit_notes": "",
"verification_flag": "N",
"signer": "",
"signer_key": "",
"author": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:48:45 +0100"
},
"commiter": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:48:45 +0100"
}
},
{
"commit": "cb484ec66468c5bbac1f78a8ed87852202207701",
"abbreviated_commit": "cb484ec",
"tree": "73e2e71396290d9627b9301451ca5a1bb7ba6df4",
"abbreviated_tree": "73e2e71",
"parent": "becd22ff715defbe00e064181ee71266e3d1db45",
"abbreviated_parent": "becd22f",
"refs": "",
"encoding": "",
"subject": "ue04-plots - titel",
"sanitized_subject_line": "ue04-plots-titel",
"body": "",
"commit_notes": "",
"verification_flag": "N",
"signer": "",
"signer_key": "",
"author": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:33:59 +0100"
},
"commiter": {
"name": "Lorenz Bauer",
"email": "",
"date": "Wed, 14 Dec 2022 00:33:59 +0100"
}
},
What do I have to change to make this a valid JSON document that json.loads() can process?
Answer
It looks like you are turning the output of git log into a JSON file, passing it on to a JSON parser, and getting an error there?
Yes, your output is not valid JSON: to be an array, the list of objects must be wrapped in square brackets, with [ at the beginning and ] at the end, and there must be no comma after the last object.
See https://stackoverflow.com/a/4600561/9035237 → https://gist.github.com/textarcana/1306223 for a post-processing example. The code in the link you mentioned does this too.
If you are using Python, you can do:
output = "[" + output + "]"
output = output.replace("},]", "}]")
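To check that these two lines are enough for output like the sample in the question, here is a minimal sketch. The helper name wrap_as_array and the shortened sample string are made up for illustration; the sketch assumes the text ends directly with "}," as it does with --pretty=format:, with no trailing newline.

```python
import json


def wrap_as_array(output: str) -> str:
    """Wrap comma-terminated JSON objects in [ ] and drop the last comma."""
    wrapped = "[" + output + "]"
    return wrapped.replace("},]", "}]")


# Shortened stand-in for the git log output from the question
sample = '{\n "commit": "4099117"\n},\n{\n "commit": "37445d7"\n},'

commits = json.loads(wrap_as_array(sample))
print(len(commits), commits[0]["commit"])
```

If the text could end with whitespace after the last comma, the replace() call would not match, so stripping the string first would be a sensible precaution.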
However, there are still problems with your format: JSON does not accept a raw line break inside a string, and a " in any field will break the structure, yet both are likely to occur in a commit message. So your format should change.
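Both failure modes can be reproduced without git. The sketch below uses hand-written strings that imitate what the format would produce for such commit messages; the helper is_valid_json is a hypothetical name used only here.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if json.loads() accepts the text, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


ok = '{"subject": "fix typo"}'
# A double quote inside the subject ends the JSON string too early.
with_quote = '{"subject": "revert "fix typo""}'
# "\n" here is a real line break inside the JSON string, which is not allowed.
with_newline = '{"body": "line one\nline two"}'

print(is_valid_json(ok), is_valid_json(with_quote), is_valid_json(with_newline))
```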
As https://gist.github.com/varemenos/e95c2e098e657c7688fd?permalink_comment_id=3260906#gistcomment-3260906 suggests, you can use a hack: pick some string that is unlikely to occur in any field, for example ^^^^, as a temporary quote placeholder. Then do the character escaping, for example a line break → \n and " → \", and finally replace ^^^^ → ". Don't pretty-print the JSON at this step; leave that to a JSON formatter.
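A minimal sketch of that placeholder hack in Python, under some assumptions that are not in the linked comment: the format string is written on one line with ^^^^ in place of every ", without %n and without the trailing comma, and the commits are separated by a NUL byte (as git log -z should produce). The function names are hypothetical. Backslashes are escaped first so that the escapes added afterwards are not doubled.

```python
import json

QUOTE = "^^^^"  # temporary quote placeholder, assumed absent from all fields


def record_to_json(record: str) -> str:
    """Escape one commit record, then turn the placeholders into real quotes."""
    escaped = (
        record.replace("\\", "\\\\")  # backslash first
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    return escaped.replace(QUOTE, '"')


def records_to_json_array(raw: str, sep: str = "\x00") -> str:
    """Join the escaped records into one JSON array string."""
    records = [r for r in raw.split(sep) if r]
    return "[" + ",".join(record_to_json(r) for r in records) + "]"


# Hand-written stand-in for two commits in the placeholder format
raw = (
    '{^^^^subject^^^^: ^^^^say "hi"^^^^, ^^^^body^^^^: ^^^^one\ntwo^^^^}'
    "\x00"
    "{^^^^subject^^^^: ^^^^a\\b^^^^, ^^^^body^^^^: ^^^^^^^^}"
)

commits = json.loads(records_to_json_array(raw))
print(commits[0]["subject"])
```

This only covers the characters listed; other control characters in a field would still be rejected by json.loads(), and a field that itself contains ^^^^ would break the result.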