I’m creating a tool that displays a Python project as a UML diagram (and also shows some code error detection in a GUI).
I scan a project using Pyreverse, which gives me all the data I need to draw the UML diagram. The problem is positioning the class boxes on the canvas.
For a start, I decided to use an already implemented force-based algorithm to decide the positions of the classes. It works quite well; here’s the result https://github.com/jvorcak/gpylint/blob/master/screenshots/gpylint.png and here’s the code (Python, but it’s easy to understand even for non-Python programmers).
There is one problem: it’s great for displaying general graphs, but for UML I’d like some enhancements. For instance, if two classes extend one superclass, I’d expect them to be at the same level in the graph, as in graphs generated by the dot program.
Answer
It seems that the main enhancement you are missing is transforming your graph into a layered graph. This is no easy task, but it’s doable (the quality of the result depends on the amount of time and thought invested in the process).
The main idea is to do some kind of topological sorting on the graph to split it into layers, rearrange the vertices within the layers, and then draw the graph. (You can find Python code for a real topological sort online, but a real topological sort just produces a long, line-like graph, and we want something a little different.)
So I’ll try to describe an algorithm to transform a given graph into a layered one:
First, remove the cycles. Topological sorting doesn’t work on graphs with cycles, so if the input graph is not already a directed acyclic graph, you’ll have to find a set of edges that can be removed (or possibly reversed) to make it acyclic (you will add them back to the layered graph later, but that will break the layering and make the graph less pretty :). Since finding the smallest possible set of edges to remove (the feedback arc set problem) is NP-complete (very hard), I think you’ll have to take some shortcuts here: don’t insist on the minimal set of edges, just find a workable one in reasonable time.
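One cheap shortcut for this step is a depth-first search that reverses every edge pointing back to a vertex still on the DFS stack. This is only a sketch of that heuristic, not a minimal feedback arc set: the function name `break_cycles` and the `{node: [successors]}` dictionary format are my own assumptions, and I simply drop self-loops.

```python
def break_cycles(graph):
    """Return (acyclic_graph, changed_edges) for a dict {node: [successors]}.

    Heuristic only: every DFS back edge is reversed and self-loops are
    dropped, so the result is acyclic but the changed set is not minimal.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    nodes = set(graph) | {v for succs in graph.values() for v in succs}
    color = {n: WHITE for n in nodes}
    acyclic = {n: [] for n in nodes}
    changed = []  # original edges that were reversed or dropped

    def visit(u):
        color[u] = GREY
        for v in graph.get(u, []):
            if v == u:
                changed.append((u, v))      # self-loop: drop it
            elif color[v] == GREY:
                changed.append((u, v))      # back edge: reverse it
                if u not in acyclic[v]:
                    acyclic[v].append(u)
            else:
                if v not in acyclic[u]:
                    acyclic[u].append(v)
                if color[v] == WHITE:
                    visit(v)
        color[u] = BLACK

    # Sorted start order keeps the result deterministic.
    for n in sorted(nodes):
        if color[n] == WHITE:
            visit(n)
    return acyclic, changed
```

For example, `break_cycles({"A": ["B"], "B": ["C"], "C": ["A"]})` reverses the edge `("C", "A")` and leaves the rest alone. The recursion is fine for class diagrams of ordinary size; a very deep graph would need an explicit stack.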
Next, break the graph into layers. There are many optimizations that can be done here, but I suggest you keep it simple: repeatedly collect all the vertices that have no incoming edges into a new layer, remove them from the graph, and continue until no vertices are left. This might produce a line-like graph in some simple cases, but it suits UML graphs quite well.
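A minimal sketch of this layering step, assuming the graph is already acyclic, is stored as a `{node: [successors]}` dictionary, and has its edges pointing from superclass to subclass so that base classes end up in the top layer (the name `assign_layers` is hypothetical):

```python
def assign_layers(graph):
    """Split an acyclic {node: [successors]} graph into a list of layers.

    Each pass collects the vertices that have no incoming edges left,
    then removes them from the graph by decrementing their successors'
    in-degree counts.
    """
    nodes = set(graph) | {v for succs in graph.values() for v in succs}
    indegree = {n: 0 for n in nodes}
    for succs in graph.values():
        for v in succs:
            indegree[v] += 1

    layers = []
    current = sorted(n for n in nodes if indegree[n] == 0)
    placed = 0
    while current:
        layers.append(current)
        placed += len(current)
        following = []
        for u in current:
            for v in graph.get(u, []):
                indegree[v] -= 1
                if indegree[v] == 0:
                    following.append(v)
        current = sorted(following)

    if placed != len(nodes):
        raise ValueError("graph still contains a cycle")
    return layers
```

With `assign_layers({"Base": ["A", "B"], "A": ["C"]})` the result is `[["Base"], ["A", "B"], ["C"]]`, so the two subclasses of `Base` share a level. Note that a class with several superclasses is placed below the deepest of them, so siblings are only guaranteed to line up when they have no other, deeper parent.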
Then, reduce the edge crossings. A good graph is one with the smallest number of edges crossing each other. It doesn’t sound important, but it contributes greatly to the overall look of the graph. What determines the number of crossings is the order of the vertices within each layer. But again, finding the minimum number of crossings or finding a maximum crossing-free set of edges is NP-complete :( “so again it is typical to resort to heuristics, such as placing each vertex at a position determined by finding the average or median of the positions of its neighbors on the previous level and then swapping adjacent pairs as long as that improves the number of crossings.”
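The quoted heuristic can be sketched for a single pair of adjacent layers roughly like this. It is my own illustration with hypothetical names (`count_crossings`, `order_by_barycenter`); it uses the average (barycenter) rather than the median, keeps vertices without neighbors near their old position, and counts crossings by brute force, which is fine for small diagrams only.

```python
def count_crossings(upper, lower, edges):
    """Count crossings between edges (u, v) with u in upper and v in lower."""
    pos_u = {n: i for i, n in enumerate(upper)}
    pos_l = {n: i for i, n in enumerate(lower)}
    pairs = [(pos_u[u], pos_l[v]) for u, v in edges
             if u in pos_u and v in pos_l]
    crossings = 0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            (a1, b1), (a2, b2) = pairs[i], pairs[j]
            # Two edges cross when their endpoints are in opposite order.
            if (a1 - a2) * (b1 - b2) < 0:
                crossings += 1
    return crossings


def order_by_barycenter(upper, lower, edges):
    """Reorder `lower` to reduce crossings with the fixed layer `upper`."""
    pos_u = {n: i for i, n in enumerate(upper)}
    neighbours = {n: [] for n in lower}
    for u, v in edges:
        if u in pos_u and v in neighbours:
            neighbours[v].append(pos_u[u])

    def barycenter(n):
        ps = neighbours[n]
        # Assumption: a vertex with no neighbours keeps its old index.
        return sum(ps) / len(ps) if ps else lower.index(n)

    ordered = sorted(lower, key=barycenter)

    # Swap adjacent pairs as long as that strictly reduces the crossings.
    improved = True
    while improved:
        improved = False
        for i in range(len(ordered) - 1):
            before = count_crossings(upper, ordered, edges)
            ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
            if count_crossings(upper, ordered, edges) < before:
                improved = True
            else:
                ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
    return ordered
```

In practice you would sweep this over all consecutive layer pairs, top to bottom and back again, a few times and keep the best ordering found.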
Finally, the edges removed (or reversed) in the first step of the algorithm are restored to their original direction.
And there you have it! A nice layered graph for your UML.
- If my explanation wasn’t clear enough, try reading the Wikipedia article on Layered graph drawing again, or ask me any questions, and I’ll try to respond.
- Remember that this is an algorithm for the general case, and lots of optimizations can be made to better handle your specific case.
- If you want more ideas for features for your UML tool, look at the wonderful work done by JetBrains for their IntelliJ UML tool.
Hope that my comments here are helpful in any way.
Important update: since you stated that you are “Looking for an answer drawing from credible and/or official sources.”, I am attaching the formal documentation from Graphviz (of dot’s algorithm), whose authors “describe a four-pass algorithm for drawing directed graphs. The first pass finds an optimal rank assignment using a network simplex algorithm. The second pass sets the vertex order within ranks by an iterative heuristic incorporating a novel weight function and local transpositions to reduce crossings. The third pass finds optimal coordinates for nodes by constructing and ranking an auxiliary graph. The fourth pass makes splines to draw edges. The algorithm makes good drawings and runs fast.” http://www.graphviz.org/Documentation/TSE93.pdf
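If you would rather let dot run those four passes than reimplement them, one option is to write your class graph out in the DOT language and feed it to the `dot` program. The helper below is only a sketch under my own assumptions: `to_dot` is a hypothetical name, the input is a list of `(subclass, superclass)` pairs, and class names contain no double quotes.

```python
def to_dot(inheritance):
    """Build DOT source from (subclass, superclass) pairs.

    rankdir=BT draws edge tails below edge heads, so with edges going
    from subclass to superclass the base classes end up on top.
    """
    lines = [
        "digraph classes {",
        "  rankdir=BT;",
        "  node [shape=box];",
        "  edge [arrowhead=onormal];",  # hollow triangle, as in UML
    ]
    for sub, sup in inheritance:
        lines.append(f'  "{sub}" -> "{sup}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot([("B", "A"), ("C", "A")]))
```

Saving that output as `classes.dot` and running `dot -Tplain classes.dot` prints the computed node coordinates as text, which your own canvas code could then read instead of drawing the image with Graphviz.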