I have a nested dictionary as below:
stud_data_dict = { 's1' : {'Course 1' : {'Course Name':'Maths', 'Marks':95, 'Grade': 'A+'}, 'Course 2' : {'Course Name':'Science', 'Marks': 75, 'Grade': 'B-'}}, 's2' : {'Course 1' : {'Course Name':'English', 'Marks': 82, 'Grade': 'B'}, 'Course 2' : {'Course Name':'Maths', 'Marks': 90, 'Grade': 'A'}}}
I need to convert it into a dataframe like below
Student Course 1 Course 2 Course Name Marks Grade Course Name Marks Grade s1 Maths 95 A+ Science 75 B- s2 English 82 B Maths 90 A
I have tired the following code from this answer
stud_df = pandas.DataFrame.from_dict(stud_data_dict, orient="index").stack().to_frame() final_df = pandas.DataFrame(stud_df[0].values.tolist(), index=stud_df.index)
I am getting the dataframe like below
Course Name Marks Grade s1 Course 1 Maths 95 A+ Course 2 Science 75 B- s2 Course 1 English 82 B Course 2 Maths 90 A
This is the closest I got to the desired output. What changes do I need to make to the code to get the desired dataframe?
Advertisement
Answer
Change dictionary
first and then pass to Series
with reshape by Series.unstack
:
#reformat nested dict #https://stackoverflow.com/a/39807565/2901002 d = {(level1_key, level2_key, level3_key): values for level1_key, level2_dict in stud_data_dict.items() for level2_key, level3_dict in level2_dict.items() for level3_key, values in level3_dict.items()} stud_df = pd.Series(d).unstack([1,2]) print (stud_df) Course 1 Course 2 Course Name Marks Grade Course Name Marks Grade s1 Maths 95 A+ Science 75 B- s2 English 82 B Maths 90 A
Another idea is created dictionary of tuples in keys with defaultdict
:
from collections import defaultdict d = defaultdict(dict) for k, v in stud_data_dict.items(): for k1, v1 in v.items(): for k2, v2 in v1.items(): d[(k1, k2)].update({k: v2}) df = pd.DataFrame(d) print(df) Course 1 Course 2 Course Name Marks Grade Course Name Marks Grade s1 Maths 95 A+ Science 75 B- s2 English 82 B Maths 90 A