Compute row distance matrix using only for loops

Question

I am stuck trying to calculate a distance matrix from different binary arrays, and I can only use for loops to solve this... The problem is the following: imagine I have a binary matrix built from different rows as follows, with dimensions n=3, m=3 in this case: And I would like to achieve the …

Accepted Answer

If I understood correctly, you could do:

    import numpy as np

    mat = np.matrix([[0, 0, 0],
                     [1, 0, 1],
                     [1, 1, 1]])

    nrows, ncols = np.shape(mat)
    # One row and one column per input row, so the result is nrows x nrows.
    result = np.zeros((nrows, nrows), dtype=int)

    for r in range(nrows):
        # We only need to compare the upper triangular part of the matrix.
        for i in range(r + 1, nrows):
            for j in range(ncols):
                result[r, i] += mat[r, j] != mat[i, j]

    # Here we copy the upper triangular part to the lower triangular part
    # to make the result symmetric.
    result = result + result.T
    print(result)

Output:

    [[0 2 3]
     [2 0 1]
     [3 1 0]]

If you can at least use some numpy functions:

    # You can also iterate over a matrix row by row.
    for i, row in enumerate(mat):
        # Sum the differences. mat != row already compares row against
        # every row of the whole matrix.
        result[i, :] = np.sum(mat != row, axis=1).transpose()
    print(result)

Output:

    [[0 2 3]
     [2 0 1]
     [3 1 0]]

In case you want to see a neat trick, here is how you could do it without iterating with a for loop. The following code uses "broadcasting": we add a dimension to the array so that the comparison is automatically done against each row:

    # For this trick we need to convert the matrix to an array.
    mat_arr = np.asarray(mat)
    result_broadcasting = np.sum(mat_arr != mat_arr[:, None], axis=2)
    print(result_broadcasting)

Output:

    [[0 2 3]
     [2 0 1]
     [3 1 0]]
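Since the question restricts the solution to for loops, the triple loop from the answer can also be written without NumPy at all. The following is only a minimal sketch, assuming the input is a list of equal-length lists of 0/1 values; the function name `row_distance_matrix` is hypothetical and does not come from the question or the answer:

```python
def row_distance_matrix(rows):
    """Return the pairwise row distances (number of differing positions)."""
    n = len(rows)
    # n x n result filled with zeros; the diagonal stays 0.
    dist = [[0 for _ in range(n)] for _ in range(n)]
    for r in range(n):
        # Only visit pairs (r, i) with i > r, because the matrix is symmetric.
        for i in range(r + 1, n):
            count = 0
            for j in range(len(rows[r])):
                if rows[r][j] != rows[i][j]:
                    count += 1
            dist[r][i] = count
            dist[i][r] = count  # mirror into the lower triangle
    return dist


mat = [[0, 0, 0],
       [1, 0, 1],
       [1, 1, 1]]

for line in row_distance_matrix(mat):
    print(line)
# [0, 2, 3]
# [2, 0, 1]
# [3, 1, 0]
```

Writing each count to both `dist[r][i]` and `dist[i][r]` replaces the `result + result.T` step, so no transpose is needed.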

Answer