I have some highly nested JSON files I need to work with.
A short example:
    {
      "coffee": [
        { "value": "coffee" },
        { "value": "water" }
      ],
      "cake": {
        "value": {
          "dough": [
            { "value": "2", "name": "eggs" },
            { "value": "500g", "name": "flour" },
            {
              "value": {
                "almondpaste": [
                  { "value": "300g", "name": "almonds" },
                  { "value": "200g", "name": "oil" },
                  { "value": "200g", "name": "sugar" }
                ]
              }
            },
            { "value": "200g", "name": "sugar" },
            ...
I would like to read all names from the JSON file and write them into a list. This is not particularly difficult if the JSON file has a fixed structure. However, my JSON files vary in structure and depth: sometimes everything is on one level, but some files nest four or five levels deep. I am therefore looking for a generic solution that walks through all levels of the JSON and searches for certain keys.
I have already tried something in the following direction, but I always get error messages.
    list = []
    for k for val in json_file for d in val for j in d.keys():
        if k == "name":
            list.append(k['name'])
        if d == "name":
            list.append(k['name'])
        if j == "name":
            list.append(k['name'])
    print(list)
Error:
    for k for val in json_file for d in val for j in d.keys():
          ^
    SyntaxError: invalid syntax
Maybe someone has a code sample that could solve my problem and from which I could develop an idea for myself?
Answer
You can define this function:
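    def iterate(data):
        """Recursively yield every value stored under a 'name' key."""
        if isinstance(data, list):
            # Lists carry no keys themselves, so descend into each element.
            for item in data:
                yield from iterate(item)
        elif isinstance(data, dict):
            for key, item in data.items():
                if key == 'name':
                    yield item
                else:
                    # Any other key may hide nested lists or dicts.
                    yield from iterate(item)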
You can then use it like this (data is your parsed JSON data):
result = list(iterate(data))
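If the JSON is still text (for example the contents of a file), it has to be parsed first. The following is only a sketch of how that might look: it uses json.loads from the standard library, and it generalizes the function above so that the key to search for is a parameter. The name iter_key_values and the shortened sample string are made up for illustration and are not part of the original answer.

```python
import json


def iter_key_values(data, key):
    """Yield every value stored under `key`, at any nesting depth.

    Hypothetical generalization of iterate(): the key is a parameter.
    """
    if isinstance(data, list):
        for item in data:
            yield from iter_key_values(item, key)
    elif isinstance(data, dict):
        for k, item in data.items():
            if k == key:
                yield item
            else:
                yield from iter_key_values(item, key)


# Shortened stand-in for the real document; in practice this text
# would be read from a file (an assumption, adapt as needed).
raw = (
    '{"cake": {"value": {"dough": ['
    '{"value": "2", "name": "eggs"}, '
    '{"value": {"almondpaste": [{"value": "300g", "name": "almonds"}]}}'
    ']}}}'
)

data = json.loads(raw)  # parse the JSON text into dicts and lists
names = list(iter_key_values(data, "name"))
print(names)
```

Note that, like the original function, this sketch does not descend into a value once its key matches. Searching for "value" would therefore return the nested dictionaries themselves rather than the entries inside them.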
Let’s do an example. This is your input data:
    >>> data
    {'coffee': [{'value': 'coffee'}, {'value': 'water'}], 'cake': {'value': {'dough': [{'value': '2', 'name': 'eggs'}, {'value': '500g', 'name': 'flour'}, {'value': {'almondpaste': [{'value': '300g', 'name': 'almonds'}, {'value': '200g', 'name': 'oil'}, {'value': '200g', 'name': 'sugar'}]}}, {'value': '200g', 'name': 'sugar'}]}}}
Here is the output:
    >>> list(iterate(data))
    ['eggs', 'flour', 'almonds', 'oil', 'sugar', 'sugar']
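The result contains 'sugar' twice because the name occurs in two places in the input. If each name should appear only once, one possible approach (not part of the original answer) is to drop duplicates while keeping the order of first appearance. The sketch below assumes Python 3.7 or later, where dictionaries preserve insertion order; the variable unique_names is just an illustrative name.

```python
# Output of list(iterate(data)) from the example above.
names = ['eggs', 'flour', 'almonds', 'oil', 'sugar', 'sugar']

# dict keys are unique and keep insertion order (Python 3.7+),
# so a round trip through dict.fromkeys removes later duplicates.
unique_names = list(dict.fromkeys(names))
print(unique_names)  # ['eggs', 'flour', 'almonds', 'oil', 'sugar']
```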