I am trying to build a tree like hierarchy from modules and their respective submodules, each module has a name, data, and a list of its submodules,
@dataclass class module(object): name: str data: int submodules: list = field(default_factory=list) def main(): module1 = module("Module1", 1, ["Module2", "Module3"]) module2 = module("Module2", 2, ["Module4"]) module3 = module("Module3", 3, ["Module6", "Module7"]) module4 = module("Module4", 4, ["Module5"]) module5 = module("Module5", 5, ) module6 = module("Module6", 6, ) module7 = module("Module7", 7, ) module_list = [module4, module2, module6, module1, module3, module5, module7] #This is just Example Data for mod in module_list: for item in module_list: if mod.name in item.submodules: print(mod.name + " is submodule of " + item.name) if __name__ == '__main__': main()
With this code I’m able to detect if something is a submodule of something else but I don’t know how I am supposed to form the connections/build the data structure.
My desired output should look something like this:
module1 ├── module2 │ └── module4 │ └── module5 └── module3 ├── module6 └── module7
I tried using Anytree and Treelib to achieve this for an arbitrary amount of modules but I could not come up with a working solution. Although this problem looks simple (might not be) I just can not figure it out.
Cheers!
Advertisement
Answer
Your data structure already represents a hierarchy, but you could also store the reference of a module’s parent, and have a container class Hierarchy
with which to facilitate access to functionality.
I would name the class Module
with a capital M
, and in submodules
store references to Module instances instead of strings. The container Hierarchy
class could act as a dictionary so to give access to Module instance by their name.
Here is a proposal:
class Module: def __init__(self, name: str, data: int, parent: 'Module' = None): self.name = name self.data = data self.parent = parent if parent: parent.submodules.append(self) self.submodules = [] def preorder(self, depth): yield self, depth for submodule in self.submodules: yield from submodule.preorder(depth + 1) @property def ancestors(self): if self.parent.parent: yield from self.parent.ancestors yield self.parent class Hierarchy(dict): def __init__(self): super() self._root = Module("_root", None, None) self.toplevel = self._root.submodules def addmodule(self, name, data, submodulenames=None): if name in self: self[name].data = data else: self[name] = Module(name, data, self._root) for subname in submodulenames or (): if subname not in self: self[subname] = Module(subname, None, self[name]) else: if self[subname].parent: self[subname].parent.submodules.remove(self[subname]) self[subname].parent = self[name] self[name].submodules.append(self[subname]) def __iter__(self): for module in self.toplevel: yield from module.preorder(0)
You could build your input much like you did before:
hierarchy = Hierarchy() hierarchy.addmodule("Module1", 1, ["Module2", "Module3"]) hierarchy.addmodule("Module2", 2, ["Module4"]) hierarchy.addmodule("Module3", 3, ["Module6", "Module7"]) hierarchy.addmodule("Module4", 4, ["Module5"]) hierarchy.addmodule("Module5", 5, ) hierarchy.addmodule("Module6", 6, ) hierarchy.addmodule("Module7", 7, )
Here are then some examples on how to use this hierarchy
:
print("Complete hierarchy:") for module, depth in hierarchy: print(" " * depth, module.name) print("Which is the parent of Module4?", hierarchy["Module4"].parent.name) print("Which are the submodules of Module4?", [m.name for m in hierarchy["Module4"].submodules]) print("Which are the top level modules?", [m.name for m in hierarchy.toplevel]) print("which are the ancestors of Module4?", [m.name for m in hierarchy["Module4"].ancestors])
This will output:
Complete hierarchy: Module1 Module2 Module4 Module5 Module3 Module6 Module7 which is the parent of Module4? Module2 which are the submodules of Module4? ['Module5'] which are the top level modules? ['Module1'] which are the ancestors of Module4? ['Module1', 'Module2']