I have a Python list of lists. I want to merge all the inner lists that share at least one common element and remove the duplicate items
I have a large data set that is a list of lists, and some of the inner lists share elements. I want to merge all the lists that have data in common.
    # sample data
    foo = [
        [0,1,2,6,9],
        [0,1,2,6,5],
        [3,4,7,3,2],
        [12,36,28,73],
        [537],
        [78,90,34,72,0],
        [573,73],
        [99],
        [41,44,79],
    ]

    # i want to get this
    [
        [0,1,2,6,9,5,3,4,7,3,2,78,90,34,72,0],
        [12,36,28,73,573,73,573],
        [99],
        [41,44,79],
    ]
Lists that share even one common element should be grouped together.
The original data file is at https://files.catbox.moe/y1yt5w.json
Edit: this is what I am trying:
    import json

    data = json.load(open('x.json'))  # https://files.catbox.moe/y1yt5w.json

    class Relations:
        def __init__(self):
            pass

        def process_relation(self, flat_data):
            relation_keys = []
            rel = {}
            for i in range(len(flat_data)):
                rel[i] = []
                for n in flat_data:
                    if i in n:
                        rel[i].extend(n)
            return {k: list(set(v)) for k, v in rel.items()}

        def process(self, flat_data):
            rawRelations = self.process_relation(flat_data)
            return rawRelations

    rel = Relations()
    print(json.dumps(rel.process(data), indent=4), file=open('out.json', 'w'))  # https://files.catbox.moe/n65tie.json
NOTE: the largest number present in the data is equal to the length of the list of lists.
Answer
A simple (and probably non-optimal) algorithm that modifies the input data in place:
    target_idx = 0
    while target_idx < len(data):
        src_idx = target_idx + 1
        did_merge = False
        while src_idx < len(data):
            if set(data[target_idx]) & set(data[src_idx]):
                data[target_idx].extend(data[src_idx])
                data.pop(src_idx)  # this was merged
                did_merge = True
                continue  # with same src_idx
            src_idx += 1
        if not did_merge:
            target_idx += 1
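The in-place loop above rebuilds sets and rescans the remaining lists after every merge, and it keeps duplicate items in the merged lists. If the data is large, or duplicates should be dropped, a union-find (disjoint-set) pass may work better. The following is only a sketch of that alternative, not part of the answer above: the name `merge_overlapping` is made up for illustration, and it assumes every element is hashable. It returns new lists and leaves the input untouched.

```python
def merge_overlapping(lists):
    """Group lists that share at least one element (union-find sketch).

    Hypothetical helper, not from the original answer. Assumes all
    elements are hashable. Returns new lists without duplicates.
    """
    parent = {}  # maps each element to its parent in the disjoint-set forest

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for lst in lists:
        for item in lst:
            parent.setdefault(item, item)
        # Union every element of this list with the list's first element.
        for item in lst[1:]:
            root_a, root_b = find(lst[0]), find(item)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect elements by root; dict keys keep first-seen order and drop duplicates.
    groups = {}
    for lst in lists:
        for item in lst:
            groups.setdefault(find(item), {})[item] = None
    return [list(group) for group in groups.values()]


foo = [
    [0, 1, 2, 6, 9], [0, 1, 2, 6, 5], [3, 4, 7, 3, 2], [12, 36, 28, 73],
    [537], [78, 90, 34, 72, 0], [573, 73], [99], [41, 44, 79],
]
print(merge_overlapping(foo))
```

With the sample data this groups `[537]` on its own, because 537 appears in no other list, and each merged group lists every value once, in order of first appearance.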