Skip to content
Advertisement

How to convert a dataframe to nested json

I have this DataFrame:

df = pd.DataFrame({'Survey': "001_220816080015", 'BCD': "001_220816080015.bcd", 'Sections': "4700A1/305, 4700A1/312"})

All the dataframe fields are ASCII strings and is the output from a SQL query (pd.read_sql_query) so the line to create the dataframe above may not be quite right.

And I wish the final JSON output to be in the form

[{
  "Survey": "001_220816080015",
  "BCD": "001_220816080015.bcd",
  "Sections": [
    "4700A1/305", 
    "4700A1/312"
}]

I realize that may not be ‘normal’ JSON but that is the format expected by a program over which I have no control.

The nearest I have achieved so far is

[{
      "Survey": "001_220816080015",
      "BCD": "001_220816080015.bcd",
      "Sections": "4700A1/305, 4700A1/312"
    }]

Problem might be the structure of the dataframe but how to reformat it to produce the requirement is not clear to me.

The JSON line is:

df.to_json(orient='records', indent=2)

Advertisement

Answer

Isn’t the only thing you need to do to parse the Sections into a list?

import pandas as pd

df= pd.DataFrame({'Survey': "001_220816080015", 'BCD': "001_220816080015.bcd", 'Sections': "4700A1/305, 4700A1/312"}, index=[0])

df['Sections'] = df['Sections'].str.split(', ')
print(df.to_json(orient='records', indent=2))

[
  {
    "Survey":"001_220816080015",
    "BCD":"001_220816080015.bcd",
    "Sections":[
      "4700A1/305",
      "4700A1/312"
    ]
  }
]
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement