Centering matrix

I want to write a function that centers an input data matrix by multiplying it with the centering matrix. The function should subtract the row-wise mean from the input.

My code:

import numpy as np

def centering(data):
  n = data.shape[0]
  centeringMatrix = np.identity(n) - 1/n * (np.ones(n) @ np.ones(n).T)
  data = centeringMatrix @ data


data = np.array([[1,2,3], [3,4,5]])
centering(data)

But the resulting matrix is wrong: it is not centered.

Thanks!

Answer

The centering matrix is

np.eye(n) - np.ones((n, n)) / n
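
As a quick sanity check, here is a minimal sketch showing that left-multiplying by this matrix subtracts each column's mean. The sample array `X` and the names `C` and `centered` are made up for illustration:

```python
import numpy as np

# Hypothetical sample data: 3 observations (rows) x 2 features (columns).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [6.0, 30.0]])

n = X.shape[0]                        # number of rows
C = np.eye(n) - np.ones((n, n)) / n   # centering matrix

centered = C @ X

# Left-multiplication by C is equivalent to subtracting the column means.
assert np.allclose(centered, X - X.mean(axis=0))
# Every column of the result now has (numerically) zero mean.
assert np.allclose(centered.mean(axis=0), 0.0)
```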

Here is a list of issues in your original formulation:

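  1. np.ones(n).T is the same as np.ones(n). The transpose of a 1D array is a no-op in numpy, so np.ones(n) @ np.ones(n).T is an inner product that evaluates to the scalar n rather than an n-by-n matrix of ones. If you want to turn a row vector into a column vector, add the dimension explicitly: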

    np.ones((n, 1))
    

    OR

    np.ones(n)[:, None]
    
  2. The usual definition subtracts the column-wise mean, not the row-wise mean. To operate on rows, build the matrix from the number of columns and either right-multiply the input by it or, equivalently, transpose the input before and after left-multiplying:

    n = data.shape[1]
    ...
    data = (centeringMatrix @ data.T).T
    
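  3. Your function creates a new array for the output but does not currently return anything. You can either return the result or assign it in place. Note that the in-place variants only make sense if data has a floating-point dtype; the integer array in your example cannot store fractional means: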

    return (centeringMatrix @ data.T).T
    

    OR

    data[:] = (centeringMatrix @ data.T).T
    

    OR

    np.matmul(centeringMatrix, data.T, out=data.T)
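
Putting the three fixes together, a complete version might look like the sketch below. The name `center_rows` and the cast to float are my own choices, not part of your original code, and it returns a new array rather than modifying the input:

```python
import numpy as np

def center_rows(data):
    """Subtract each row's mean using the centering matrix (returns a new array)."""
    # Cast to float so integer input cannot truncate fractional means.
    data = np.asarray(data, dtype=float)
    n = data.shape[1]  # number of columns, i.e. the length of each row
    centering_matrix = np.eye(n) - np.ones((n, n)) / n
    # Right-multiplying centers the rows. Because the centering matrix is
    # symmetric, this equals (centering_matrix @ data.T).T
    return data @ centering_matrix


data = np.array([[1, 2, 3], [3, 4, 5]])
result = center_rows(data)
print(result)
```

For your example input, both rows come out as approximately [-1, 0, 1], up to floating-point rounding.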
    
User contributions licensed under: CC BY-SA