Skip to content
Advertisement

Creating a nested dictionary from a pandas dataframe

I have a dataframe which demonstrates a hierarchy of meters. A meter has an ID, and can have any number of children, this children can also have children, which can also have children, ad infinitum.

The dataframe has a meter per row, and the level of the child is shown by column. As shown below:

hierarchy table

The aim is to convert it to a nested dictionary in the following format:

{
    "meters": [
        {
            "meter_id": "a",
            "meter_children": [
                {
                    "meter_id": "b",
                    "meter_children": []
                },
                {
                    "meter_id": "c",
                    "meter_children": [
                        {
                            "meter_id": "d",
                            "meter_children": []
                        }
                    ]
                },
                {
                    "meter_id": "e",
                    "meter_children": []
                }
            ]
        },
        {
            "meter_id": "f",
            "meter_children": []
        },
        {
            "meter_id": "g",
            "meter_children": []
        },
        {
            "meter_id": "h",
            "meter_children": []
        },
        {
            "meter_id": "i",
            "meter_children": []
        },
        {
            "meter_id": "j",
            "meter_children": []
        },
        {
            "meter_id": "k",
            "meter_children": []
        },
        {
            "meter_id": "l",
            "meter_children": [
                {
                    "meter_id": "m",
                    "meter_children": []
                },
                {
                    "meter_id": "n",
                    "meter_children": []
                },
                {
                    "meter_id": "o",
                    "meter_children": []
                }
            ]
        },
        {
            "meter_id": "p",
            "meter_children": []
        },
        {
            "meter_id": "q",
            "meter_children": []
        },
        {
            "meter_id": "r",
            "meter_children": []
        },
        {
            "meter_id": "s",
            "meter_children": []
        },
        {
            "meter_id": "t",
            "meter_children": []
        },
        {
            "meter_id": "u",
            "meter_children": []
        }
    ]
}

I have managed to achieve this, using the scary code you can see below (sorry). I was wondering if there is a tool that can do this for you, or if there is a cleaner, more readable way of accomplishing this.

Note this only goes up to a nesting level of 4, but can be easily extended further.

results = {}
list_0 = []

for row in df.values:
    
    counter = 0
    
    for entry in row:
        
        if entry==entry:
            
            entry=str(entry)
        
            if counter==0:
                
                list_0.append({
                    "meter_id":entry,
                    "meter_children":[]
                })
                meter_0 = entry
                
                list_1 = []
                
            if counter==1:
                            
                for item in list_0:
                    
                    if meter_0 in item.values():
                        
                        list_1.append({
                            "meter_id":entry,
                            "meter_children":[]
                        })
                        item["meter_children"]=list_1
    
                        meter_1=entry
                        
            
                list_2=[]
                
            if counter==2:
                
                for item in list_0:
                    
                    if meter_0 in item.values():
                        
                        for item in list_1:
                            
                            if meter_1 in item.values():
                                
                                list_2.append({
                                    "meter_id":entry,
                                    "meter_children":[]
                                })
                                item["meter_children"]=list_2
                                
                                meter_3=entry
                                 
                list_3=[]
                                    
            if counter==3:
                
                for item in list_0:
                    
                    if meter_0 in item.values():
                        
                        for item in list_1:
                            
                            if meter_1 in item.values():
                                
                                for item in list_2:
                                    
                                    if meter_2 in item.values():
                                        
                                        list_3.append({
                                            "meter_id":entry,
                                            "meter_children":[]
                                        })
                                        item["meter_children"]=list_3

                                        meter_4=entry
                                        
                list_4=[]
                
        counter+=1
                
results["meters"] = list_0

Advertisement

Answer

You can use itertools.groupby with recursion:

from itertools import groupby as gb
d = [['a', None, None, None, None, None, None, None], [None, 'b', None, None, None, None, None, None], [None, 'c', None, None, None, None, None, None], [None, None, 'd', None, None, None, None, None], [None, 'e', None, None, None, None, None, None], ['f', None, None, None, None, None, None, None], ['g', None, None, None, None, None, None, None], ['h', None, None, None, None, None, None, None], ['i', None, None, None, None, None, None, None], ['j', None, None, None, None, None, None, None], ['k', None, None, None, None, None, None, None], ['l', None, None, None, None, None, None, None], [None, 'm', None, None, None, None, None, None], [None, 'n', None, None, None, None, None, None], [None, 'o', None, None, None, None, None, None], ['p', None, None, None, None, None, None, None], ['q', None, None, None, None, None, None, None], ['r', None, None, None, None, None, None, None], ['s', None, None, None, None, None, None, None], ['t', None, None, None, None, None, None, None], ['u', None, None, None, None, None, None, None]]
def get_tree(d):
   r = []
   for a, b in gb(d, key=lambda x:x[0] is not None):
     if a:
        r.extend([{"meter_id":j, "meter_children":[]} for j, *_ in b])
     else:
        r[-1]['meter_children'] = get_tree([j for _, *j in b])
   return r       

import json
print(json.dumps({'meters':get_tree(d)}, indent=4))

Output:

{
    "meters": [
       {
           "meter_id": "a",
            "meter_children": [
               {
                   "meter_id": "b",
                   "meter_children": []
              },
              {
                  "meter_id": "c",
                  "meter_children": [
                      {
                          "meter_id": "d",
                          "meter_children": []
                      }
                   ]
              },
              {
                  "meter_id": "e",
                  "meter_children": []
              }
           ]
       },
       {
           "meter_id": "f",
           "meter_children": []
       },
       {
           "meter_id": "g",
           "meter_children": []
       },
       {
           "meter_id": "h",
           "meter_children": []
       },
       {
           "meter_id": "i",
           "meter_children": []
       },
       {
           "meter_id": "j",
           "meter_children": []
       },
       {
           "meter_id": "k",
           "meter_children": []
       },
       {
           "meter_id": "l",
           "meter_children": [
               {
                   "meter_id": "m",
                   "meter_children": []
               },
               {
                   "meter_id": "n",
                   "meter_children": []
               },
               {
                   "meter_id": "o",
                   "meter_children": []
               }
           ]
        },
        {
            "meter_id": "p",
            "meter_children": []
        },
        {
            "meter_id": "q",
            "meter_children": []
       },
       {
            "meter_id": "r",
            "meter_children": []
       },
       {
            "meter_id": "s",
            "meter_children": []
       },
       {
            "meter_id": "t",
            "meter_children": []
       },
       {
            "meter_id": "u",
            "meter_children": []
       }
    ] 
}
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement