My data is as following (this just extract but there are much more objects, some don’t have the additionalData
)
JavaScript
x
56
56
1
{
2
"referenceSetCount":1,
3
"totalRowCount":4,
4
"referenceSets":[
5
{
6
"name":"table",
7
"rowCount":4,
8
"_links":{
9
"self":{
10
"href":"link"
11
}
12
},
13
"referenceDataItems":[
14
{
15
"col1":"5524",
16
"col2":"yyy",
17
"col3":1,
18
"additionalData":[
19
{
20
"col1":111,
21
"col2":"xxxx",
22
"col3":1,
23
"col4":"18"
24
},
25
{
26
"col1":222,
27
"col2":"2222",
28
"col3":1,
29
"col4":"1"
30
}
31
]
32
},
33
{
34
"col1":"26434",
35
"col2":"dfdshere",
36
"col3":2,
37
"additionalData":[
38
{
39
"col1":34522,
40
"col2":"fsfs",
41
"col3":2,
42
"col4":"18"
43
},
44
{
45
"col1":5444,
46
"col2":"gregrege",
47
"col3":2,
48
"col4":"2"
49
}
50
]
51
}
52
]
53
}
54
]
55
}
56
I’m trying to iterate with list comprehension to get dataframe of referenceDataItems
and everything within that key, also additionalData
if appears.
JavaScript
1
12
12
1
import os
2
import sys
3
import pandas as pd
4
import urllib.request, json
5
6
api_url = urllib.request.urlopen("link_to_my_data")
7
8
api_data = json.loads(api_url.read())
9
10
#nest loop to get referenceSets + nested additionalData
11
data_alt = [v for k, v in api_data.items() if k == 'referenceSets']
12
Expected result:
JavaScript
1
4
1
col1 col2 col3 col1 col2 col3 col4 col1 col2 col3 col4
2
5524 yyy 1 111 xxxx 1 18 222 2222 1 1
3
26434 dfdshere 2 34522 fsfs 2 18 5444 gregrege 2 2
4
Advertisement
Answer
I did some research and this almost got my desired data, needs little modification in COLUMNS_TO_DROP
JavaScript
1
16
16
1
COLUMNS_TO_DROP = ["additionalData"]
2
3
def expand_additional_data(items):
4
for item in items:
5
for av in item.get("additionalData", []):
6
item[av["col2a"]] = av["col4a"]
7
yield item
8
9
10
for ref_set in data["referenceSets"]:
11
table_name = ref_set["name"]
12
expanded = expand_additional_data(ref_set["referenceDataItems"])
13
df = pd.DataFrame(expanded)
14
df = df.drop(COLUMNS_TO_DROP, axis=1, errors="ignore")
15
print(df)
16