python3 join lists that have same value in list of lists

Question

I have similar question to one that has been asked several years ago, the link is down here. the thing is that all answers are in python 2 and does work for me. my lists are huge so time is important. if anyone can solve that for python3, that will really help. Consider this list of lists: I want to

Accepted Answer

This can be seen as a graph problem in which you merge subgraphs and need to find the connected components.Here is your graph:networkxUsing networkx you can do:import networkx as nxfrom itertools import chain, pairwise# python < 3.10 use this recipe for pairwise instead# from itertools import tee# def pairwise(iterable):#     a, b = tee(iterable)#     next(b, None)#     return zip(a, b)G = nx.Graph()G = nx.from_edgelist(chain.from_iterable(pairwise(e) for e in l))G.add_nodes_from(set.union(*map(set, l))) # adding single itemslist(nx.connected_components(G))output:[{1, 2, 3, 4}, {5, 6, 7, 8, 9}]pythonNow, you can use pure python to perform the same thing, finding the connected components and merging them.An example code is nicely described in this post (archive.org link for long term).In summary, the first step is building the list of neighbors, then a recursive function is used to join the neighbors of neighbors keeping track of the already seen ones.from collections import defaultdict   #merge function to  merge all sublist having common elements. def merge_common(lists):     neigh = defaultdict(set)     visited = set()     for each in lists:         for item in each:             neigh[item].update(each)     def comp(node, neigh = neigh, visited = visited, vis = visited.add):         nodes = set([node])         next_node = nodes.pop         while nodes:             node = next_node()             vis(node)             nodes |= neigh[node] - visited             yield node     for node in neigh:         if node not in visited:             yield sorted(comp(node))example:merge_common(l)# [[1, 2, 3, 4], [5, 6, 7, 8, 9]]

Advertisement

Answer

networkx

python