I would like to remove the coordinates that lie close to each other or if it is just a duplicate.
For example,
x = [[9, 169], [5, 164],[340,210],[1020,102],[210,312],[12,150]]
In the above list, the first and second element lies close to each other. How do I remove the second element while preserving the first one?
Following is what I have tried,
def process(input_list, thresh=(10, 10)): buffer = input_list.copy() n = 0 prev_cx, prev_cy = 0, 0 for i in range(len(input_list)): elem = input_list[i] cx, cy = elem if n == 0: prev_cx, prev_cy = cx, cy else: ab_cx, ab_cy = abs(prev_cx - cx), abs(prev_cy - cy) if ab_cx <= thresh[0] and ab_cy <= thresh[1]: del buffer[i] n += 1 return buffer x = [[9, 169], [5, 164], [340, 210], [1020, 102], [210, 312], [12, 150]] processed = process(x) print(processed)
The problem is that it doesn’t recursively check if there are any other duplicates since it only checks the adjacent coordinates. What is an efficient way of filtering the coordinate?
Sample Input with thresh = (10,10):
x = [[12,24], [5, 12],[100,1020], [20,30], [121,214], [15,12]]
Sample output:
x = [[12,24],[100,1020], [121,214]]
Advertisement
Answer
Your question is a bit vague, but I’m taking it to mean:
- You want to compare all combinations of points
- If a combination contains points closer than a threshold
- Then remove the point further from the start of the input list
Try this:
import itertools def process(input_list, threshold=(10,10)): combos = itertools.combinations(input_list, 2) points_to_remove = [point2 for point1, point2 in combos if abs(point1[0]-point2[0])<=threshold[0] and abs(point1[1]-point2[1])<=threshold[1]] points_to_keep = [point for point in input_list if point not in points_to_remove] return points_to_keep coords = [[12,24], [5, 12],[100,1020], [20,30], [121,214], [15,12]] print(process(coords)) >>> [[12, 24], [5, 12], [100, 1020], [121, 214]]
The way this works is to generate all combinations of points using itertools (which leaves the points in the original order), and then create a list of points to remove using the threshold. Then it returns a list of points not in that list of points to remove.
You’ll notice that I have an extra point than you. I simply copied what seemed to be your intended functionality (i.e. both dy AND dx <= thresh for point removal). However, if I change the line with the AND statement to remove point if dy OR dx <= thresh, I get the same output as your sample.
So I’m going to ask you to recheck your sample output.
BTW, it might be useful for you to confirm if checking for x and y proximity separately is what you really want. So as a bonus, I’ve included a version using the Euclidean distance as well:
import itertools import math def process(input_list, threshold=100): combos = itertools.combinations(input_list, 2) points_to_remove = [point2 for point1, point2 in combos if math.dist(point1, point2)<=threshold] points_to_keep = [point for point in input_list if point not in points_to_remove] return points_to_keep coords = [[12,24], [5, 12],[100,1020], [20,30], [121,214], [15,12]] print(process(coords)) >>> [[12, 24], [100, 1020], [121, 214]]
This version fits your original sample when I used a threshold radius of 100.