I have a long list of hexahedral point coordinates, for example:
[[0, 57, 2948, 56, 449, 14953, 15002, 5446],
[ 449, 14953, 15002, 5446, 450, 14954, 15003, 5495],
[ 450, 14954, 15003, 5495, 451, 14955, 15004, 5544],
[ 451, 14955, 15004, 5544, 452, 14956, 15005, 5593],
[ 452, 14956, 15005, 5593, 453, 14957, 15006, 5642],
.]
Each row defines a hexahedron cell, and by iterating over each cell, I extract the defining faces of the cell (6 faces), and add each face to a list processed_faces
All of this is fine, but because some cells are sharing the same face, I needed a way to check if the current face has been processed before or not (also needed later for connectivity calculations, so no way out of this).
for cell_id, cell in enumerate(cells):
faces = get_faces(cell)
for face in faces:
# have we met `face` before?
if not self.face_exists(face):
self.add_face(face)
# link the face to the cell who shares it
self.link_face_to_cell(face, cell_id)
The bottleneck of my code is face_exists()
, for which I tried two different approaches:
Sort each processed face, convert it to a tuple (to be hashable), add the tuple to a list of tuples, and simple check if a face exists by
tuple(sorted(face)) in faces
. But this is AWFULLY SLOW.Implement a trie data structure, which works just fine (and about 100x faster than method 1), but I am not satisfied with the performance
JavaScript142421class TrieNode:
2__slots__ = ['index', 'is_end', 'children']
3def __init__(self, index: int):
4self.index = index
5self.is_end = False
6self.children = {}
7
8
9class Trie:
10__slots__ = ['root']
11def __init__(self):
12self.root = TrieNode(-1)
13
14def insert(self, face: List[int]):
15node = self.root
16
17for point in face:
18if point in node.children:
19node = node.children[point]
20else:
21new_node = TrieNode(point)
22node.children[point] = new_node
23node = new_node
24
25node.is_end = True
26
27
28def search(self, face: List[int]) -> bool:
29node =self.root
30
31for point in face:
32if point in node.children:
33node = node.children[point]
34continue
35else:
36return False
37
38if node.is_end:
39return True
40
41return False
42
Sorry for the long question, in summary I am looking for two things:
- An alternative data structure for mesh storage, that allows my fast search needs. OR:
- If trie is suitable for my application, is there a python library (hopefully with a C backend) that allows storing list of numbers, because libraries I came across are mostly designed for strings.
Advertisement
Answer
One of possible drop-in solutions would be to use a set
/frozenset
for storing faces. Sets have great lookup performance and are simple to use here.