I'm looking to get a list of all possible JSON paths in a JSON file. Can anyone recommend a way to do this?
E.g., if the input is:
    {
      "_id":{
        "$oid":""
      },
      "aa":false,
      "bb":false,
      "source":"",
      "email":"",
      "createdAt":{
        "$date":""
      },
      "updatedAt":{
        "$date":""
      },
      "cc":"",
      "vv":"",
      "metadata":{
        "vv":"",
        "xx":[{}]
      }
    }
Expected output:
    obj
    obj._id
    obj._id.$oid
    obj.aa
    obj.bb
    obj.source
    obj.email
    obj.createdAt
    obj.createdAt.$date
    obj.updatedAt
    obj.updatedAt.$date
    obj.cc
    obj.vv
    obj.metadata
    obj.metadata.vv
    obj.metadata.xx
    obj.metadata.xx[0]
I'm basically looking for a Python version of this: https://www.convertjson.com/json-path-list.htm
I want to build a general solution that works for any JSON file. The input will be a single value used for schema generation (i.e. one line of a newline-delimited JSON file). Any suggestions?
Answer
You can do this in a reasonably succinct way with a recursive generator. The string "obj" is a little awkward since it doesn't occur in the data structure. On the other hand, adding it at the end is simple:
    def get_paths(d):
        if isinstance(d, dict):
            for key, value in d.items():
                yield f'.{key}'
                yield from (f'.{key}{p}' for p in get_paths(value))

        elif isinstance(d, list):
            for i, value in enumerate(d):
                yield f'[{i}]'
                yield from (f'[{i}]{p}' for p in get_paths(value))

    paths = ['obj'+s for s in get_paths(d)]
This gives you paths as a list of strings:
    ['obj._id',
     'obj._id.$oid',
     'obj.aa',
     'obj.bb',
     'obj.source',
     'obj.email',
     'obj.createdAt',
     'obj.createdAt.$date',
     'obj.updatedAt',
     'obj.updatedAt.$date',
     'obj.cc',
     'obj.vv',
     'obj.metadata',
     'obj.metadata.vv',
     'obj.metadata.xx',
     'obj.metadata.xx[0]']
Of course, you can wrap that last step in a function that accepts a root object string:
    def get_paths(d, root="obj"):
        def recur(d):
            if isinstance(d, dict):
                for key, value in d.items():
                    yield f'.{key}'
                    # Recurse with the inner helper so the root is not prepended again
                    yield from (f'.{key}{p}' for p in recur(value))

            elif isinstance(d, list):
                for i, value in enumerate(d):
                    yield f'[{i}]'
                    yield from (f'[{i}]{p}' for p in recur(value))

        return (root + p for p in recur(d))

    list(get_paths(d))
    # same result
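Since the question mentions newline-delimited JSON, here is a minimal sketch of how the same recursive idea might be applied across several records. The names `iter_paths` and `paths_for_ndjson` and the two-record `sample` are made up for illustration and are not part of the answer above. The sketch also assumes you want the bare root (`obj`) listed first, as in the question's expected output, and the union of paths over all lines.

```python
import json


def iter_paths(value, prefix="obj"):
    """Yield the path of every nested key and list index under `value`."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}"
            yield path
            yield from iter_paths(child, path)
    elif isinstance(value, list):
        for i, child in enumerate(value):
            path = f"{prefix}[{i}]"
            yield path
            yield from iter_paths(child, path)


def paths_for_ndjson(lines, root="obj"):
    """Collect distinct paths across all records, in first-seen order."""
    seen = {}  # a dict keeps insertion order, so it works as an ordered set
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for path in iter_paths(json.loads(line), root):
            seen.setdefault(path, None)
    # Assumption: the root itself is reported first, as in the question
    return [root, *seen]


# Hypothetical two-record input; each string is one line of the file
sample = [
    '{"_id": {"$oid": ""}, "aa": false}',
    '{"aa": true, "metadata": {"vv": "", "xx": [{}]}}',
]
print(paths_for_ndjson(sample))
# ['obj', 'obj._id', 'obj._id.$oid', 'obj.aa', 'obj.metadata',
#  'obj.metadata.vv', 'obj.metadata.xx', 'obj.metadata.xx[0]']
```

An open file object can be passed in place of `sample`, since iterating over a text file yields one line at a time. Keys that themselves contain dots or brackets would produce ambiguous paths with this simple formatting.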