Getting the coordinates of elements in clusters without a loop in numpy

Question

I have a 2D array in which I label clusters using the ndimage.label() function. I can get the element counts, the centroids, or the bounding boxes of the labeled clusters, but I would also like to get the coordinates of each element in each cluster. Something like this (the data structure doesn't have to be like this, any data structure

Accepted Answer

You can build a map of the coordinates, then sort and split it:

    import numpy as np

    # labeled_array is the label array returned by ndimage.label() in the question.

    # Get the indexes (coordinates) of the labeled (non-zero) elements
    ind = np.argwhere(labeled_array)

    # Get the labels corresponding to those indexes above
    labels = labeled_array[tuple(ind.T)]

    # Sort both arrays so that lower label numbers appear before higher label numbers.
    # This is not for cosmetic reasons: the "diff" step below relies on the labels
    # being sorted.
    sort = labels.argsort()
    ind = ind[sort]
    labels = labels[sort]

    # Find the split points where a new label number starts in the ordered label numbers
    splits = np.flatnonzero(np.diff(labels)) + 1

    # Create a data structure out of the label numbers and indexes (coordinates).
    # The first argument to zip is the 0th label number plus the label numbers at the split points.
    # The second argument is the indexes (coordinates), split at the split points,
    # so both arguments to zip have the same length.
    result = {k: v for k, v in zip(labels[np.r_[0, splits]],
                                   np.split(ind, splits))}
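To make the sort-and-split idea easier to try out, here is a minimal self-contained sketch of the same steps. It makes a few assumptions that are not part of the answer: the function name cluster_coordinates is hypothetical, the small hand-written array stands in for the output of ndimage.label(), a stable sort is used so that coordinates within each cluster stay in row-major order, and an early return handles an array with no labeled elements.

```python
import numpy as np


def cluster_coordinates(labeled_array):
    """Map each non-zero label to an (n, ndim) array of its element coordinates.

    Hypothetical helper name; it wraps the argwhere / argsort / split steps.
    """
    # Coordinates of every labeled (non-zero) element, in row-major order.
    ind = np.argwhere(labeled_array)
    if ind.size == 0:
        # Assumption: an array with no clusters yields an empty dict.
        return {}

    # Label value at each of those coordinates.
    labels = labeled_array[tuple(ind.T)]

    # Stable sort groups equal labels together and keeps row-major order
    # inside each group.
    order = labels.argsort(kind="stable")
    ind = ind[order]
    labels = labels[order]

    # Positions where the sorted label value changes.
    splits = np.flatnonzero(np.diff(labels)) + 1

    # First element of each group gives that group's label.
    starts = np.r_[0, splits]
    return {int(k): v for k, v in zip(labels[starts], np.split(ind, splits))}


# Hand-written stand-in for a label array (0 is background).
labeled_array = np.array([
    [1, 1, 0, 0],
    [0, 0, 0, 2],
    [3, 0, 2, 2],
])

coords = cluster_coordinates(labeled_array)
for label, points in coords.items():
    print(label, points.tolist())
# 1 [[0, 0], [0, 1]]
# 2 [[1, 3], [2, 2], [2, 3]]
# 3 [[2, 0]]
```

Each value is an array with one row per element and one column per dimension, so the same sketch should also work for label arrays with more than two dimensions.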

Answer