I am looking for a better way of calculating the following:
import numpy as np
np.random.seed(123)
# test code
t = np.random.randint(3, size = 100)
X = np.random.random((100, 3))
m = np.random.random((3, 3))
# current method
res = 0
for k in np.unique(t):
    for row in X[t == k] - m[k]:
        res += np.outer(row, row)
res
"""
Output:
array([[12.45661335, -3.51124346, 3.75900294],
[-3.51124346, 14.85327689, -3.02281263],
[ 3.75900294, -3.02281263, 18.30868772]])
"""
I would prefer getting rid of the for loops using numpy.
This is the within-class scatter matrix for Fisher's linear discriminant.
Answer
You can write it as follows:
Y = X - m[t]
np.matmul(Y.T, Y)
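This works because sum_i y_i y_i' = Y' Y, where Y is the (N, 3) matrix of centered rows and y_i = Y[i, :] is its i-th row, treated as a column vector. ' denotes the transpose. Indexing m[t] picks the class mean for every row of X, so Y = X - m[t] subtracts each sample's own class mean in a single step.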
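As a quick sanity check, the sketch below rebuilds the test data from the question and compares the loop against the vectorized form. The names res_loop, res_vec and res_einsum are introduced here for illustration only, and the np.einsum line is an optional equivalent spelling that is not part of the original answer.

```python
import numpy as np

# Same test data as in the question
np.random.seed(123)
t = np.random.randint(3, size=100)   # class label (0, 1 or 2) for each sample
X = np.random.random((100, 3))       # samples, one per row
m = np.random.random((3, 3))         # m[k] is the mean vector used for class k

# Loop version from the question
res_loop = np.zeros((3, 3))
for k in np.unique(t):
    for row in X[t == k] - m[k]:
        res_loop += np.outer(row, row)

# Vectorized version: m[t] has shape (100, 3), one class mean per sample
Y = X - m[t]
res_vec = Y.T @ Y                    # same as np.matmul(Y.T, Y)

# Equivalent einsum spelling (illustrative alternative)
res_einsum = np.einsum("ni,nj->ij", Y, Y)

print(np.allclose(res_loop, res_vec))
print(np.allclose(res_vec, res_einsum))
```

Comparing with np.allclose rather than == avoids false mismatches from floating-point summation order, which differs between the loop and the matrix product.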