
Deceptively simple implementation of topological sorting in Python

Extracted from here, we have a minimal iterative DFS routine. I call it minimal because you can hardly simplify the code further:

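The listing itself is not reproduced here. As a minimal sketch of what such a routine could look like, assuming `graph` is a dict mapping each node to a list of its neighbors (the body and the sample `graph` are assumptions, not the original listing):

```python
def iterative_dfs(graph, start):
    # Assumed representation: dict of node -> list of neighbor nodes.
    path = []      # nodes in the order they are first visited
    q = [start]    # nodes still waiting to be explored
    while q:
        v = q.pop(0)            # take the next node from the front
        if v not in path:
            path = path + [v]
            q = graph[v] + q    # explore v's neighbors before anything else
    return path


# Hypothetical sample graph, used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(iterative_dfs(graph, "a"))
```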

Here’s my question: how could you transform this routine into a topological sort method that is also “minimal”? I’ve watched this video and the idea is quite clever, so I was wondering whether the same trick could be applied to the code above so that the resulting topological_sort is “minimal” too.

I’m not asking for a version of topological sorting that is unrelated to the routine above; I’ve already seen a few of those. The question is not “how do I implement topological sorting in Python” but rather: what is the smallest possible set of tweaks that turns the code above into a topological_sort?

ADDITIONAL COMMENTS

In the original article, the author says:

A while ago, I read a graph implementation by Guido van Rossum that was deceptively simple. Now, I insist on a pure Python minimal system with the least complexity. The idea is to be able to explore the algorithm. Later, you can refine and optimize the code but you will probably want to do this in a compiled language.

The goal of this question is not to optimize iterative_dfs but to come up with a minimal version of topological_sort derived from it (just for the sake of learning more about graph theory algorithms). In fact, I guess a more general question could be: given the set of minimal algorithms {iterative_dfs, recursive_dfs, iterative_bfs, recursive_bfs}, what would their topological_sort derivations be? That would make the question longer and more complex, though, so figuring out topological_sort from iterative_dfs is good enough.


Answer

It’s not easy to turn an iterative implementation of DFS into a topological sort, since the change that needs to be made is more natural with a recursive implementation. You can still do it, but it requires implementing your own stack.

First off, here’s a slightly improved version of your code (it’s much more efficient and not much more complicated):

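A sketch of the kind of improvement meant here, assuming the same dict-of-lists `graph`: membership is tracked in a set, and nodes are popped from the end of the list, which is cheap. The name `iterative_dfs_improved` is hypothetical, and the details are an assumption rather than the original listing.

```python
def iterative_dfs_improved(graph, start):
    seen = set()   # O(1) membership tests instead of scanning a list
    path = []      # visit order, built in place
    q = [start]
    while q:
        v = q.pop()             # popping from the end is O(1)
        if v not in seen:
            seen.add(v)
            path.append(v)
            # Neighbors are now visited in reverse list order;
            # use reversed(graph[v]) to keep the original order.
            q.extend(graph[v])
    return path


# Hypothetical sample graph, used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(iterative_dfs_improved(graph, "a"))
```

Each node is still visited exactly once; only the order among siblings changes.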

Here’s how I’d modify that code to do a topological sort:

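A sketch of such a modification, under the assumption that `graph` is a dict of lists describing a directed acyclic graph. The names `stack`, `order` and `q` follow the explanation that comes next; the exact body is a reconstruction, not the original listing.

```python
def iterative_topological_sort(graph, start):
    seen = set()
    stack = []   # current chain of nodes whose subtrees are still open
    order = []   # finished nodes, collected in reverse topological order
    q = [start]
    while q:
        v = q.pop()
        if v not in seen:
            seen.add(v)
            q.extend(graph[v])

            # New stuff here: unwind until the top of the stack is a parent of v.
            while stack and v not in graph[stack[-1]]:
                order.append(stack.pop())
            stack.append(v)

    # Nodes left on the stack are already in the correct order.
    return stack + order[::-1]


# Hypothetical sample graph, used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(iterative_topological_sort(graph, "a"))
```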

The part I commented with “new stuff here” is the part that figures out the order as you move up the stack. It checks if the new node that’s been found is a child of the previous node (which is on the top of the stack). If not, it pops the top of the stack and adds the value to order. While we’re doing the DFS, order will be in reverse topological order, starting from the last values. We reverse it at the end of the function, and concatenate it with the remaining values on the stack (which conveniently are already in the correct order).

Because this code needs to check v not in graph[stack[-1]] a bunch of times, it will be much more efficient if the values in the graph dictionary are sets, rather than lists. A graph usually doesn’t care about the order its edges are saved in, so making such a change shouldn’t cause problems with most other algorithms, though code that produces or updates the graph might need fixing. If you ever intend to extend your graph code to support weighted graphs, you’ll probably end up changing the lists to dictionaries mapping from node to weight anyway, and that would work just as well for this code (dictionary lookups are O(1) just like set lookups). Alternatively, we could build the sets we need ourselves, if graph can’t be modified directly.
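If the graph cannot be changed, one option is to build the sets inside the function. The following is a sketch under that assumption; the `children` dict is a hypothetical helper introduced only for this illustration.

```python
def iterative_topological_sort(graph, start):
    # Build set-valued adjacency once so membership tests are O(1).
    # The caller's graph (dict of lists) is left untouched.
    children = {node: set(neighbors) for node, neighbors in graph.items()}

    seen = set()
    stack = []
    order = []
    q = [start]
    while q:
        v = q.pop()
        if v not in seen:
            seen.add(v)
            q.extend(graph[v])   # traversal still follows the original lists
            while stack and v not in children[stack[-1]]:
                order.append(stack.pop())
            stack.append(v)
    return stack + order[::-1]


# Hypothetical sample graph, used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(iterative_topological_sort(graph, "a"))
```

Building `children` costs one pass over all edges, which is paid back once the membership test runs more than a handful of times.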

For reference, here’s a recursive version of DFS, and a modification of it to do a topological sort. The modification needed is very small indeed:

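A sketch of both functions, assuming a dict-of-lists `graph` without cycles. The helper name `recursive_helper` and the `result` and `seen` variables follow the surrounding text; everything else is an assumption rather than the original listing.

```python
def recursive_dfs(graph, node):
    result = []
    seen = set()

    def recursive_helper(node):
        seen.add(node)
        result.append(node)   # recorded on the way down (pre-order)
        for neighbor in graph[node]:
            if neighbor not in seen:
                recursive_helper(neighbor)

    recursive_helper(node)
    return result


def recursive_topological_sort(graph, node):
    result = []
    seen = set()

    def recursive_helper(node):
        seen.add(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                recursive_helper(neighbor)
        result.insert(0, node)   # recorded on the way back up (post-order)

    recursive_helper(node)
    return result


# Hypothetical sample graph, used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(recursive_dfs(graph, "a"))
print(recursive_topological_sort(graph, "a"))
```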

That’s it! One line gets removed and a similar one gets added at a different location. If you care about performance, you should probably do result.append in the second helper function too, and do return result[::-1] in the top level recursive_topological_sort function. But using insert(0, ...) is a more minimal change.
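The performance-minded variant mentioned above might look like the following sketch, again assuming an acyclic dict-of-lists `graph`. Appending is O(1), whereas `insert(0, ...)` shifts the whole list each time.

```python
def recursive_topological_sort(graph, node):
    result = []
    seen = set()

    def recursive_helper(node):
        seen.add(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                recursive_helper(neighbor)
        result.append(node)   # cheap append; order is reversed once at the end

    recursive_helper(node)
    return result[::-1]       # reverse of finishing order is a topological order


# Hypothetical sample graph, used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(recursive_topological_sort(graph, "a"))
```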

It’s also worth noting that if you want a topological order of the whole graph, you shouldn’t need to specify a starting node. Indeed, there may not be a single node that lets you traverse the entire graph, so you may need to do several traversals to reach everything. An easy way to make that happen in the iterative topological sort is to initialize q to list(graph) (a list of all the graph’s keys) instead of a list containing only a single starting node. For the recursive version, replace the call to recursive_helper(node) with a loop that calls the helper function on every node in the graph that is not yet in seen.
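Both whole-graph variants can be sketched as follows, assuming every node appears as a key in an acyclic dict-of-lists `graph`. The function names `iterative_topological_sort_all` and `recursive_topological_sort_all` are hypothetical.

```python
def iterative_topological_sort_all(graph):
    seen = set()
    stack = []
    order = []
    q = list(graph)   # start from every node instead of a single one
    while q:
        v = q.pop()
        if v not in seen:
            seen.add(v)
            q.extend(graph[v])
            while stack and v not in graph[stack[-1]]:
                order.append(stack.pop())
            stack.append(v)
    return stack + order[::-1]


def recursive_topological_sort_all(graph):
    result = []
    seen = set()

    def recursive_helper(node):
        seen.add(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                recursive_helper(neighbor)
        result.insert(0, node)

    for node in graph:   # loop replaces the single recursive_helper(node) call
        if node not in seen:
            recursive_helper(node)
    return result


# Hypothetical sample graph with no single node that reaches everything.
graph = {"b": ["c"], "a": ["b"], "c": [], "d": ["c"]}
print(iterative_topological_sort_all(graph))
print(recursive_topological_sort_all(graph))
```

The two functions may return different orders; a graph usually has more than one valid topological order.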

User contributions licensed under: CC BY-SA