Skip to content
Advertisement

Python pandas extract data from nested list

For a personal project, I’m calling data from the Google Books API and uploading the fields I want to mySQL.

I’ve successfully made the API request and received data. The data received is nested and i want to place this in a dataframe. Now my code works with the “first” column (no indent). All data after, I cannot enter this in a dataframe.

A small part of the API output:

    {
  "kind": "books#volumes",
  "totalItems": 1,
  "items": [
    {
      "kind": "books#volume",
      "id": "vH_vwQEACAAJ",
      "etag": "Gzpw8EnKpVY",
      "selfLink": "https://www.googleapis.com/books/v1/volumes/vH_vwQEACAAJ",
      "volumeInfo": {
        "title": "Crash course programmeren in Python",
        "subtitle": "projectgericht leren programmeren",

Bellow is my python code:

import requests
import pandas as pd
import json

request_string = "https://www.googleapis.com/books/v1/volumes?q=isbn:9789059056749&key=###KEY###"
response = requests.get(request_string).text
response_json = json.loads(response)
response_json_normalized = pd.json_normalize(response_json)

author_df = pd.DataFrame(data=response_json_normalized, columns=['title'])
print(author_df)

When using columns=['totalItems'] the value 1 is in the Dataframe, but when using columns=['title'] it says NaN. The same goes for id for example.

Am i missing something simple? Or is my aproach wrong?

The goal is to eventually enter the data in a mySQL DB, to catalogue all the books I have. Any advice on future steps is appreciated!

Advertisement

Answer

This is working for me.

import requests
import pandas as pd
import json
from pandas import json_normalize

request_string = "https://www.googleapis.com/books/v1/volumes?q=isbn:9789059056749&key=###KEY###"
response = requests.get(request_string).text
response_json = json.loads(response)
recs = response_json['items']
df = json_normalize(recs)



    
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement