Straightforward question: I would like to retrieve all the nodes connected to a given node within a NetworkX graph in order to create a subgraph. In the example shown below, I just want to extract all the nodes inside the circle, given the name of one of any one of them.
I’ve tried the following recursive function, but hit Python’s recursion limit, even though there are only 91 nodes in this network.
Regardless of whether or not the below code is buggy, what is the best way to do what I’m trying to achieve? I will be running this code on graphs of various sizes, and will not know beforehand what the maximum recursion depth will be.
def fetch_connected_nodes(node, neighbors_list): for neighbor in assembly.neighbors(node): print(neighbor) if len(assembly.neighbors(neighbor)) == 1: neighbors_list.append(neighbor) return neighbors_list else: neighbors_list.append(neighbor) fetch_connected_nodes(neighbor, neighbors_list) neighbors = [] starting_node = 'NODE_1_length_6578_cov_450.665_ID_16281' connected_nodes = fetch_connected_nodes(starting_node, neighbors)
Advertisement
Answer
Assuming the graph is undirected, there is a built-in networkx command for this:
node_connected_component(G, n)
The documentation is here. It returns all nodes in the connected component of G
containing n
.
It’s not recursive, but I don’t think you actually need or even want that.
comments on your code: You’ve got a bug that will often result an infinite recursion. If u
and v
are neighbors both with degree at least 2, then it will start with u
, put v
in the list and when processing v
put u
in the list and keep repeating. It needs to change to only process neighbors that are not in neighbors_list
. It’s expensive to check that, so instead use a set. There’s also a small problem if the starting node has degree 1. Your test for degree 1 doesn’t do what you’re after. If the initial node has degree 1, but its neighbor has higher degree it won’t find the neighbor’s neighbors.
Here’s a modification of your code:
def fetch_connected_nodes(G, node, seen = None): if seen == None: seen = set([node]) for neighbor in G.neighbors(node): print(neighbor) if neighbor not in seen: seen.add(neighbor) fetch_connected_nodes(G, neighbor, seen) return seen
You call this like fetch_connected_nodes(assembly, starting_node)
.