Skip to content
Advertisement

Construct graph connectivity matrices in COO format

I have faced the following subtask while working with graph data:

I need to construct graph connectivity matrices in COO format for graphs with several fully-connected components from arrays of “border” indices.

As an example, given array

JavaScript

the resulting COO matrix should be

JavaScript

That is, borders array contains ranges of nodes that should form fully-connected subgraphs (starting index included and ending index excluded).

I came up with the following algorithm, but I suspect that the perfomance could be improved:

JavaScript

I feel there may be a faster solution, maybe using some numpy vectorized operations.

Does anyone have any ideas?

Advertisement

Answer

Since your goal is a faster solution than what you have, you can explore itertools for solving this efficiently. This approach benchmarks approximately 25 times faster than your current approach as tested on larger border lists.

JavaScript
JavaScript

Alternate approaches and benchmarks:

Note: These have been benchmarked on your current small list as well as on a larger border list as generated by borders = np.arange(300)[np.random.randint(0,2,(300,),dtype=bool)]

Disjoint union of complete graphs

What you are trying to do is essentially combine disjoint complete graphs. Adjacency matrices for such graphs have complete connections for selective items along it’s diagonal. You can use networkx to solve these.

enter image description here

While slower than your current solution, you will find that working on these graphs objects would be much much easier and rewarding than using NumPy to represent graphs.

Approach 1:

  1. Assuming that nodes are in sequence, calculate the number of nodes in each subgraph as i
  2. Create a complete matrix filled with 1s of the shape i*i
  3. Combine the graphs using nx.disjoint_union_all
  4. Fetch the edges of this graph
JavaScript

Approach 2:

  1. Iterate over the rolling 2-gram tuples of the borders using zip
  2. Create nx.complete_graph using ranges(start, end) from these tuples
  3. Combine the graphs using nx.disjoint_union_all
  4. Fetch the edges of this graph
JavaScript

Output is a bit faster than previous but lacks the self-loops on the nodes which would have to be added separately

Using itertools.product

Approach 3:

  1. Iterate over the rolling 2-gram tuples of the borders using zip
  2. Use itertools.product completely connected edge list for each subgraph
  3. Use itertools.chain to “append” the two iterators
  4. Return them as edges
JavaScript

This approach is approximately 25 times faster than your current approach

Your current approach – benchmark

JavaScript
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement