How to combine a string of multiple json gz files in a list into one json gz file then open the file?

Question

I have a list of json gz https files. FYI: these files are not real files due to privacy laws, but they mimic the exact structure. My goal is to combine all these json gz files into one large json gz file. I've tried numerous ways to do this by referencing other Stack Overflow questions; however, I am unable t…

Accepted Answer

"This comment helped me somewhat, but in my situation, I believe that I need to add requests to get the file since it is an http."

Indeed, the built-in open function does not support HTTP access. In this case I would use urllib.request.urlopen. Consider the following example, which uses an example file provided by Mozilla:

    import json
    import gzip
    import urllib.request

    url = "https://wiki.mozilla.org/images/f/ff/Example.json.gz"
    with urllib.request.urlopen(url) as gzf:
        with gzip.open(gzf) as jsonf:
            data = json.load(jsonf)
            print(data)

This gives the output:

    {'InstallTime': '1295768962', 'Comments': 'Will test without extension.', 'Theme': 'classic/1.0', 'Version': '4.0b10pre', 'id': 'ec8030f7-c20a-464f-9b0e-13a3a9e97384', 'Vendor': 'Mozilla', 'EMCheckCompatibility': 'false', 'Throttleable': '1', 'Email': 'deinspanjer@mozilla.com', 'URL': 'http://nighthacks.com/roller/jag/entry/the_shit_finally_hits_the', 'version': '4.0b10pre', 'CrashTime': '1295903735', 'ReleaseChannel': 'nightly', 'submitted_timestamp': '2011-01-24T13:15:48.550858', 'buildid': '20110121153230', 'timestamp': 1295903748.551002, 'Notes': 'Renderers: 0x22600,0x22600,0x20400', 'StartupTime': '1295768964', 'FramePoisonSize': '4096', 'FramePoisonBase': '7ffffffff0dea000', 'AdapterRendererIDs': '0x22600,0x22600,0x20400', 'Add-ons': 'compatibility@addons.mozilla.org:0.7,enter.selects@agadak.net:6,{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}:1.3.3,sts-ui@sidstamm.com:0.1,masspasswordreset@johnathan.nightingale:1.04,support@lastpass.com:1.72.0,{972ce4c6-7e08-4474-a285-3208198ce6fd}:4.0b10pre', 'BuildID': '20110121153230', 'SecondsSinceLastCrash': '810473', 'ProductName': 'Firefox', 'legacy_processing': 0}

Explanation: the first with opens the file at the specified URL, then gzip.open is used to decompress it, so json.load can parse the JSON and return data (data is a dict). Note that all the imports used belong to the standard library, so you do not need to install any external package.
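The example above reads a single file, while the question asks about combining several into one. The following is a minimal sketch of one way that might be done, not the answerer's method. It assumes that each file holds one JSON document and that the documents fit in memory. The names load_json_gz, combine_json_gz, read_json_gz and the open_source parameter are hypothetical helpers introduced here for illustration.

```python
import gzip
import json
import urllib.request


def load_json_gz(fileobj):
    """Decompress a binary gzip stream and parse the JSON document inside it."""
    with gzip.open(fileobj, "rt", encoding="utf-8") as jsonf:
        return json.load(jsonf)


def combine_json_gz(urls, out_path, open_source=urllib.request.urlopen):
    """Parse every source and write all documents, as one JSON list, to out_path.

    open_source is a hypothetical hook: any callable that returns a binary
    file-like context manager for a given URL (or local path).
    """
    combined = []
    for url in urls:
        with open_source(url) as gzf:
            combined.append(load_json_gz(gzf))
    # Write the combined list back out as a single gzip-compressed JSON file.
    with gzip.open(out_path, "wt", encoding="utf-8") as out:
        json.dump(combined, out)
    return combined


def read_json_gz(path):
    """Open the combined file again and return its parsed contents."""
    with gzip.open(path, "rt", encoding="utf-8") as jsonf:
        return json.load(jsonf)
```

With a list of URLs, combine_json_gz(urls, "combined.json.gz") followed by read_json_gz("combined.json.gz") would give back a list with one entry per source file. Storing the documents as a JSON list is a design choice made for this sketch; if the files are large, writing one JSON document per line may be a better fit.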

Answer