Merge & write two JSONL (JSON Lines) files into a new JSONL file in Python 3.6

Hello, I have two JSONL files like so:

one.jsonl

JavaScript

second.jsonl

JavaScript

My goal is to write a new JSONL file (with encoding preserved) named merged_file.jsonl, which will look like this:

JavaScript

My approach is like this:

JavaScript

However, I am met with this error: TypeError: Object of type generator is not JSON serializable. I would appreciate any hint or help. Thank you! I have looked at other SO posts, but they all write normal JSON files. That approach should work in my case too, but it keeps failing.

Reading single file like this works:

JavaScript

Answer

It is possible that extract_json returns a generator rather than a list or dict, and a generator is not JSON serializable. Since the files are JSONL, each line is a valid JSON document on its own, so you just need to tweak your existing code a little bit.
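To illustrate why the error shows up, here is a minimal sketch. The `read_records` function is a hypothetical stand-in for a reader like extract_json (its real definition is not shown here), and the sample lines are made up. It assumes the reader is a generator function that yields one parsed object per line.

```python
import json

def read_records(lines):
    """Hypothetical reader: yields one parsed object per JSONL line."""
    for line in lines:
        yield json.loads(line)

# Made-up sample data standing in for the contents of a .jsonl file.
sample = ['{"a": 1}', '{"b": 2}']

# Passing the generator object itself to json.dumps raises TypeError.
try:
    json.dumps(read_records(sample))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Iterating the generator and dumping each record separately works.
serialized_lines = [json.dumps(record) for record in read_records(sample)]
```

The point is only that `json.dumps` needs the individual records, not the generator that produces them.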

JavaScript
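As a rough sketch of that line-by-line approach (not necessarily the exact code originally posted), the hypothetical helper `merge_jsonl` below parses every line of each input file and writes it back out as one JSON document per line. The UTF-8 encoding and `ensure_ascii=False` are assumptions made to keep non-ASCII characters readable in the output.

```python
import json

def merge_jsonl(input_paths, output_path, encoding="utf-8"):
    """Hypothetical helper: merge JSONL files, validating each line with json.

    Returns the number of records written.
    """
    count = 0
    with open(output_path, "w", encoding=encoding) as out:
        for path in input_paths:
            with open(path, "r", encoding=encoding) as src:
                for line in src:
                    line = line.strip()
                    if not line:
                        continue  # skip blank lines
                    record = json.loads(line)  # raises on malformed JSON
                    # ensure_ascii=False writes non-ASCII text as-is (assumption)
                    out.write(json.dumps(record, ensure_ascii=False) + "\n")
                    count += 1
    return count
```

With the file names from the question, this could be called as `merge_jsonl(["one.jsonl", "second.jsonl"], "merged_file.jsonl")`.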

Now that I think about it, you didn't even have to load the lines using json; doing so only helps you catch any badly formatted JSON lines.

You could collect all the lines in one shot like this:

JavaScript
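A possible version of the "one shot" idea, under the same assumptions (UTF-8 files, hypothetical helper name `concat_jsonl`): the lines are copied as plain text without parsing, so malformed JSON would pass through unnoticed.

```python
def concat_jsonl(input_paths, output_path, encoding="utf-8"):
    """Hypothetical helper: concatenate JSONL files as raw text lines.

    No JSON parsing happens here, so lines are not validated.
    Returns the number of lines written.
    """
    lines = []
    for path in input_paths:
        with open(path, "r", encoding=encoding) as src:
            # Drop the trailing newline and skip blank lines.
            lines.extend(line.rstrip("\n") for line in src if line.strip())
    with open(output_path, "w", encoding=encoding) as out:
        # Re-add exactly one newline per record, even if a source file
        # did not end with one.
        out.writelines(line + "\n" for line in lines)
    return len(lines)
```

Normalizing the newline on every line avoids two records being glued together when the first file lacks a trailing newline.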
User contributions licensed under: CC BY-SA