Skip to content
Advertisement

Convert nested dictionary to pandas dataframe

I have a nested dictionary as below:

stud_data_dict = { 's1' : {'Course 1' : {'Course Name':'Maths', 
                                         'Marks':95,
                                         'Grade': 'A+'},
                           'Course 2' : {'Course Name':'Science', 
                                         'Marks': 75,
                                         'Grade': 'B-'}},
                   's2' : {'Course 1' : {'Course Name':'English', 
                                         'Marks': 82,
                                         'Grade': 'B'},
                           'Course 2' : {'Course Name':'Maths', 
                                         'Marks': 90,
                                         'Grade': 'A'}}}

I need to convert it into a dataframe like below

  Student   Course 1                   Course 2
            Course Name  Marks  Grade  Course Name  Marks  Grade
    s1      Maths        95     A+     Science      75     B-
    s2      English      82     B      Maths        90     A

I have tired the following code from this answer

stud_df = pandas.DataFrame.from_dict(stud_data_dict, orient="index").stack().to_frame()
final_df = pandas.DataFrame(stud_df[0].values.tolist(), index=stud_df.index)

I am getting the dataframe like below

              Course Name  Marks  Grade             
s1  Course 1  Maths        95     A+
    Course 2  Science      75     B-
s2  Course 1  English      82     B
    Course 2  Maths        90     A

This is the closest I got to the desired output. What changes do I need to make to the code to get the desired dataframe?

Advertisement

Answer

Change dictionary first and then pass to Series with reshape by Series.unstack:

#reformat nested dict
#https://stackoverflow.com/a/39807565/2901002
d  = {(level1_key, level2_key, level3_key): values
       for level1_key, level2_dict in stud_data_dict.items()
       for level2_key, level3_dict in level2_dict.items()
       for level3_key, values      in level3_dict.items()}

stud_df = pd.Series(d).unstack([1,2])
print (stud_df)
    Course 1                Course 2            
   Course Name Marks Grade Course Name Marks Grade
s1       Maths    95    A+     Science    75    B-
s2     English    82     B       Maths    90     A

Another idea is created dictionary of tuples in keys with defaultdict:

from collections import defaultdict

d = defaultdict(dict)

for k, v in stud_data_dict.items():
    for k1, v1 in v.items():
        for k2, v2 in v1.items():
            d[(k1, k2)].update({k: v2})

df = pd.DataFrame(d)
print(df)
      Course 1                Course 2            
   Course Name Marks Grade Course Name Marks Grade
s1       Maths    95    A+     Science    75    B-
s2     English    82     B       Maths    90     A
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement