
More pythonic way of creating within-class scatter matrix

I am looking for a better way of calculating the following:

import numpy as np
np.random.seed(123)

# test code
t = np.random.randint(3, size = 100)
X = np.random.random((100, 3))
m = np.random.random((3, 3))

# current method
res = 0
for k in np.unique(t):
    for row in X[t == k] - m[k]:
        res += np.outer(row, row)
res
"""
Output:
array([[12.45661335, -3.51124346,  3.75900294],
       [-3.51124346, 14.85327689, -3.02281263],
       [ 3.75900294, -3.02281263, 18.30868772]])
"""

I would prefer to get rid of the for loops and use NumPy operations instead.

This is the within-class scatter matrix for Fisher's linear discriminant.


Answer

You can write it as follows:

Y = X - m[t]
np.matmul(Y.T, Y)

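Here m[t] picks, for every sample, the row of m that belongs to its class, so Y holds all the centered rows at once. The loop then collapses to a single matrix product because sum_i y_i y_i' = Y' Y, where Y is the (N, 3) matrix, y_i = Y[i, :] is its i-th row taken as a column vector, and ' denotes the transpose.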
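As a quick sanity check, the sketch below reuses the test data from the question and compares the loop against the vectorized form. The names res_loop, res_vec and res_einsum are only illustrative, and the np.einsum line is an alternative spelling of the same sum, not part of the original answer.

```python
import numpy as np

np.random.seed(123)

# Same test data as in the question
t = np.random.randint(3, size=100)   # class label (0, 1 or 2) per sample
X = np.random.random((100, 3))       # samples, one per row
m = np.random.random((3, 3))         # one "mean" row per class

# Loop version from the question: sum of outer products of centered rows
res_loop = np.zeros((3, 3))
for k in np.unique(t):
    for row in X[t == k] - m[k]:
        res_loop += np.outer(row, row)

# Vectorized version: m[t] has shape (100, 3), one class row per sample
Y = X - m[t]
res_vec = Y.T @ Y                    # same as np.matmul(Y.T, Y)

# Alternative spelling of sum_n Y[n, i] * Y[n, j]
res_einsum = np.einsum("ni,nj->ij", Y, Y)

print(np.allclose(res_loop, res_vec))
print(np.allclose(res_loop, res_einsum))
```

Note that m[t] relies on the labels in t being valid row indices into m, which is the same assumption the loop makes with m[k].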

User contributions licensed under: CC BY-SA