I have a long list of hexahedral point coordinates, for example:
[[0, 57, 2948, 56, 449, 14953, 15002, 5446], [ 449, 14953, 15002, 5446, 450, 14954, 15003, 5495], [ 450, 14954, 15003, 5495, 451, 14955, 15004, 5544], [ 451, 14955, 15004, 5544, 452, 14956, 15005, 5593], [ 452, 14956, 15005, 5593, 453, 14957, 15006, 5642], ....]
Each row defines a hexahedron cell, and by iterating over each cell, I extract the defining faces of the cell (6 faces), and add each face to a list processed_faces
All of this is fine, but because some cells are sharing the same face, I needed a way to check if the current face has been processed before or not (also needed later for connectivity calculations, so no way out of this).
for cell_id, cell in enumerate(cells): faces = get_faces(cell) for face in faces: # have we met `face` before? if not self.face_exists(face): self.add_face(face) # link the face to the cell who shares it self.link_face_to_cell(face, cell_id)
The bottleneck of my code is face_exists()
, for which I tried two different approaches:
Sort each processed face, convert it to a tuple (to be hashable), add the tuple to a list of tuples, and simple check if a face exists by
tuple(sorted(face)) in faces
. But this is AWFULLY SLOW.Implement a trie data structure, which works just fine (and about 100x faster than method 1), but I am not satisfied with the performance
class TrieNode: __slots__ = ['index', 'is_end', 'children'] def __init__(self, index: int): self.index = index self.is_end = False self.children = {} class Trie: __slots__ = ['root'] def __init__(self): self.root = TrieNode(-1) def insert(self, face: List[int]): node = self.root for point in face: if point in node.children: node = node.children[point] else: new_node = TrieNode(point) node.children[point] = new_node node = new_node node.is_end = True def search(self, face: List[int]) -> bool: node =self.root for point in face: if point in node.children: node = node.children[point] continue else: return False if node.is_end: return True return False
Sorry for the long question, in summary I am looking for two things:
- An alternative data structure for mesh storage, that allows my fast search needs. OR:
- If trie is suitable for my application, is there a python library (hopefully with a C backend) that allows storing list of numbers, because libraries I came across are mostly designed for strings.
Advertisement
Answer
One of possible drop-in solutions would be to use a set
/frozenset
for storing faces. Sets have great lookup performance and are simple to use here.