I’m looking for a function to compute the Euclidean distance between a NumPy array of points with two coordinates (x, y) and a line segment. My goal is to get a result in under 0.01 s for one line segment and 10k points.
I already found a function for a single point, but calling it in a for loop is very inefficient.
I also found this function that calculates the distance to the infinite line:
def line_dists(points, start, end):
    if np.all(start == end):
        return np.linalg.norm(points - start, axis=1)

    vec = end - start
    cross = np.cross(vec, start - points)
    return np.divide(abs(cross), np.linalg.norm(vec))
It is very efficient and I would like to have a similar approach for a bounded line.
Thank you for your help.
Answer
Setup – test point P, endpoints A and B:
Take the dot-product of P - A with normalize(A - B) to obtain the signed parallel distance component s from A. Likewise with B and t.
Take the maximum of these two numbers and zero to get the clamped parallel distance component. This will only be non-zero if the point is outside the “boundary” (Voronoi region?) of the segment.
Calculate the perpendicular distance component as before, using the cross-product.
Use Pythagoras to compute the required closest distance (gray line from P to A).
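Written out as formulas, the steps look roughly like this. This is a sketch rather than part of the original answer: the symbols d, h and c are borrowed from the variable names in the answer's code, and in 2D the cross product is taken as the scalar x1*y2 - y1*x2. Note that (P - A) · normalize(A - B) equals (A - P) · normalize(B - A), which is the form used here.

```latex
\begin{aligned}
d &= \frac{B - A}{\lVert B - A \rVert}, \\
s &= (A - P) \cdot d, \qquad t = (P - B) \cdot d, \\
h &= \max(s,\; t,\; 0), \\
c &= (P - A) \times d, \\
\operatorname{dist}(P, AB) &= \sqrt{h^{2} + c^{2}}.
\end{aligned}
```

For a point whose projection falls between A and B, both s and t are negative, so h = 0 and the result reduces to the perpendicular distance |c|.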
The above is branchless and thus easy to vectorize with numpy:
import numpy as np

def lineseg_dists(p, a, b):
    # Handle case where p is a single point, i.e. 1d array.
    p = np.atleast_2d(p)

    # TODO for you: consider implementing @Eskapp's suggestions
    if np.all(a == b):
        return np.linalg.norm(p - a, axis=1)

    # normalized tangent vector
    d = np.divide(b - a, np.linalg.norm(b - a))

    # signed parallel distance components
    s = np.dot(a - p, d)
    t = np.dot(p - b, d)

    # clamped parallel distance
    h = np.maximum.reduce([s, t, np.zeros(len(p))])

    # perpendicular distance component, as before
    # note that for the 3D case these will be vectors
    c = np.cross(p - a, d)

    # use hypot for Pythagoras to improve accuracy
    return np.hypot(h, c)
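As a quick sanity check, here is a minimal, self-contained sketch of the same approach restricted to 2D. The names `lineseg_dists_2d` and `point_seg_dist` are hypothetical and not from the answer. The 2D cross product is written out by hand, since recent NumPy versions (2.0 and later, as far as I know) emit a deprecation warning when `np.cross` is given 2-element vectors. The per-point reference uses a different formulation (clamping the projection parameter to [0, 1]) purely so the two can be compared.

```python
import numpy as np

def lineseg_dists_2d(p, a, b):
    """Distances from 2D points p (shape (n, 2) or (2,)) to segment a-b."""
    p = np.atleast_2d(np.asarray(p, dtype=float))
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Degenerate segment: plain point-to-point distance.
    if np.all(a == b):
        return np.linalg.norm(p - a, axis=1)

    # Normalized tangent vector of the segment.
    d = (b - a) / np.linalg.norm(b - a)

    # Signed parallel distance components measured from each endpoint.
    s = np.dot(a - p, d)
    t = np.dot(p - b, d)

    # Clamped parallel component: zero when the projection lies on the segment.
    h = np.maximum(np.maximum(s, t), 0.0)

    # Perpendicular component via the scalar 2D cross product (p - a) x d.
    ap = p - a
    c = ap[:, 0] * d[1] - ap[:, 1] * d[0]

    return np.hypot(h, c)

def point_seg_dist(pt, a, b):
    """Scalar reference for one point: clamp the projection parameter to [0, 1]."""
    ab = b - a
    u = np.clip(np.dot(pt - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(pt - (a + u * ab))

a = np.array([0.0, 0.0])
b = np.array([2.0, 0.0])

# Hand-checked cases: above the middle, left of A, beyond B, on the segment, beyond A.
pts = np.array([[1.0, 1.0], [-1.0, 0.0], [3.0, 4.0], [1.0, 0.0], [-3.0, 4.0]])
dists = lineseg_dists_2d(pts, a, b)
expected = np.array([1.0, 1.0, np.sqrt(17.0), 0.0, 5.0])

# Compare against the per-point loop on random points.
rng = np.random.default_rng(0)
random_pts = rng.uniform(-10.0, 10.0, size=(1_000, 2))
fast = lineseg_dists_2d(random_pts, a, b)
slow = np.array([point_seg_dist(q, a, b) for q in random_pts])

print(np.allclose(dists, expected), np.allclose(fast, slow))
```

To check the 0.01 s target on your own machine, time `lineseg_dists_2d` on an array of shape (10000, 2) with `timeit`. The actual figure depends on hardware and NumPy version, so none is quoted here.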