I am trying build a tree (dictionary ) from input numbers. My main criteria for the tree is all the nodes in a level are in a same range. ( a series of data from parent) I think you will get idea from my examples-
Please see the example
input_list=[1,2,3,4,5, 5.1,5.2,5.3,5.4, 6,7,8,8.1,8.2,8.21, 8.22,8.22,8.23,8.24,8.25, 8.3,8.4,8.5, 9,9.1,9.2,9.3,10,11,12] expected_out_put={ 0: [1,2,3,4,5,6,7,8,9,10,11,12], 5: [5.1,5.2,5.3,5.4], 8: [8.1,8.2,8.3,8.4,8.5], 8.2: [8.21,8.22,8.23,8.24,8.25], 9: [9.1,9.2,9.3], }
I tried to build a tree initially. This will work only for the data 1-9, anybody help me to find a better solution This is how I tried
class Section: FIRST_TIME=False def intialise(self): self.rank=0 self.child_sections.clear() self.next_value_diff=10000 parent=None def __init__(self,**Kwargs): self.rank = Kwargs['rank'] self.child_sections=[] self.next_value_diff=0 def set_next_value_diff(self,value): self.next_value_diff=value def is_sibling(self,section_obj): c_diff=abs(section_obj.rank-self.rank) print('c_difference is '+str(c_diff)+"and prev differnec is"+str(self.next_value_diff)) if(self.next_value_diff>c_diff): return False return True def add_child(self,section_obj): prev_root=self print("Self rank is"+str(self.rank)) print(len(self.child_sections)) print([i.rank for i in self.child_sections]) if(len(self.child_sections)>0): prev_root=self.child_sections[-1] c_diff=abs(section_obj.rank-prev_root.rank) print("addred rank "+str(section_obj.rank)+" prev "+str(prev_root.rank)+"diffrenc "+str(c_diff)) section_obj.next_value_diff=c_diff self.child_sections.append(section_obj) print("added "+str(section_obj.rank)+"under "+str(self.rank)+"diffrenc "+str(c_diff)) return data_stack=[1.0, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 2.1, 2.11, 2.12, 2.13, 2.14, 2.15, 2.16, 2.17, 2.18, 2.19, 2.2, 2.21, 2.22, 2.23, 2.24, 2.25, 2.26, 2.27, 2.28, 2.34, 2.35, 2.36, 2.37, 2.38, 2.39, 2.4, 2.42, 2.43, 2.44, 2.45, 2.46] data_stack=[i*1000 for i in data_stack] data_stack.reverse() print(data_stack) def create_root(parent): print('Current parent is'+str(parent.rank)) global data_stack while( (len(data_stack)>0) and (not parent.is_sibling(Section(rank=data_stack[-1])))): rank_value=data_stack.pop() print("poped value of "+str(rank_value)) section_obj=Section(rank=rank_value) if(len(parent.child_sections)>0): print("Found its "+str(parent.rank)+"has "+str(len(parent.child_sections))+" childrens") #check any child exists #sub_section is a sibling child=parent.child_sections[-1] print("checking simplinag "+str(section_obj.rank)+" : aginst child "+str(child.rank)) if child.is_sibling(section_obj): print("yes he is simpler with "+str(rank_value)+" : aginst child "+str(child.rank)) parent.add_child(section_obj) else: print('not sibler') nextparent=parent.child_sections.pop() print('popped '+str(nextparent.rank)) print("Koids") print([i.rank for i in nextparent.child_sections]) nextparent.add_child(section_obj) print('recurent call happenmd') parent.add_child(create_root(nextparent)) else: print("No child at the time of"+str(rank_value)) parent.add_child(section_obj) print("added to parent at the time of"+str(rank_value)) return parent parent=Section(rank=0) parent.intialise() parent=create_root(parent) print(parent.rank) print([j.rank for j in parent.child_sections]) for k in parent.child_sections: print('kids of'+str(k.rank)) print([m.rank for m in k.child_sections]) for m in k.child_sections: print('kids of'+str(m.rank)) print([n.rank for n in m.child_sections]) for n in m.child_sections: print('kids of'+str(n.rank)) print([s.rank for s in n.child_sections])
Advertisement
Answer
Here’s an attempt:
Imports: I’m working with defaultdict
and groupby
, both from the standard library:
from collections import defaultdict from itertools import groupby
Two helper functions for later:
def tuple_to_key(tup): if len(tup) == 2: return int(tup[0]) return float(f"{tup[0]}.{''.join(n for n in tup[1:-1])}") def childs_to_float(childs): return [float(f"{tup[0]}.{''.join(n for n in tup[1:])}") for tup in childs]
Step 1: Establishing the levels (the assumption is that input_list
is sorted):
levels = defaultdict(list) for tup in [(lst[0],) if len(lst) == 1 else (lst[0],) + tuple(lst[1]) for lst in [str(n).split('.') for n in input_list]]: levels[len(tup)].append(tup)
The result looks like:
{ 1: [('1',), ('2',), ('3',), ('4',), ('5',), ('6',), ('7',), ('8',), ('9',), ('10',), ('11',), ('12',)], 2: [('5', '1'), ('5', '2'), ('5', '3'), ('5', '4'), ('8', '1'), ('8', '2'), ('8', '3'), ('8', '4'), ('8', '5'), ('9', '1'), ('9', '2'), ('9', '3')], 3: [('8', '2', '1'), ('8', '2', '2'), ('8', '2', '2'), ('8', '2', '3'), ('8', '2', '4'), ('8', '2', '5')] }
Step 2: Finding the parent child relationships:
output = { parent: childs_to_float(childs) for level in levels.keys() if level > 1 for parent, childs in groupby(levels[level], key=tuple_to_key) } if 1 in levels: childs = [int(t[0]) for t in levels[1]] output[min(childs) - 1] = childs
Output:
{ 0: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 5: [5.1, 5.2, 5.3, 5.4], 8: [8.1, 8.2, 8.3, 8.4, 8.5], 8.2: [8.21, 8.22, 8.22, 8.23, 8.24, 8.25], 9: [9.1, 9.2, 9.3] }
I’m sure there are plenty of wrinkles in it …