I want to construct the following matrix:
    [ v0  v1  v2  v3  ....  v(M-d+1)
      v1  .
      v2      .
      .           .
      .
      vd  .   .             v(M)     ]
where each v(k) is a (ndarray) vector, say from a matrix
    X = np.random.randn(100, 8)
    M = 7
    d = 3
    v0 = X[:, 0]
    v1 = X[:, 1]
    ...
Using a for loop, I can do something like this for example:
    import numpy as np

    v1 = np.array([1, 2, 3]).reshape((-1, 1))
    v2 = np.array([10, 20, 30]).reshape((-1, 1))
    v3 = np.array([100, 200, 300]).reshape((-1, 1))
    v4 = np.array([100.1, 200.1, 300.1]).reshape((-1, 1))
    v5 = np.array([1.1, 2.2, 3.3]).reshape((-1, 1))
    X = np.hstack((v1, v2, v3, v4, v5))
    d = 2
    X_ = np.zeros((d * X.shape[0], X.shape[1]+1-d))
    for i in range(d):
        # copy columns i .. i + X.shape[1] - d of X into the i-th row block
        X_[i*X.shape[0]:(i+1) * X.shape[0], :] = X[:X.shape[0], i:i+(X.shape[1]+1-d)]
And I get:
    X_ = array([[  1. ,  10. , 100. , 100.1],
                [  2. ,  20. , 200. , 200.1],
                [  3. ,  30. , 300. , 300.1],
                [ 10. , 100. , 100.1,   1.1],
                [ 20. , 200. , 200.1,   2.2],
                [ 30. , 300. , 300.1,   3.3]])  # which is the wanted matrix
Is there a way to construct this matrix in a vectorized way? I imagine that would be faster than for loops for large matrices.
Thank you.
Answer
This looks about optimal; you did a good job vectorizing it already. The only improvement I can make is to replace np.zeros with np.empty, which skips initializing the array. I tried using np.vstack and np.lib.stride_tricks.sliding_window_view (after https://stackoverflow.com/a/60581287) and got the same performance as the for loop with np.empty.
    # sliding window:
    X_ = np.lib.stride_tricks.sliding_window_view(X, (X.shape[0], X.shape[1]+1-d)).reshape(d*X.shape[0], -1)

    # np.vstack:
    X_ = np.vstack([X[:, i:i+(X.shape[1]+1-d)] for i in range(d)])
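For completeness, here is a minimal sketch that puts the three variants side by side on the example from the question and checks that they agree. The function names (`hankel_loop_empty`, `hankel_sliding`, `hankel_vstack`) and the `width` variable are my own, chosen only for illustration. Note that `sliding_window_view` needs NumPy 1.20 or later, and that I pass `dtype=X.dtype` to `np.empty`, which is an assumption on my part (the loop in the question always produces float64).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20


def hankel_loop_empty(X, d):
    """Loop from the question, with np.empty in place of np.zeros (hypothetical name)."""
    n_rows, n_cols = X.shape
    width = n_cols + 1 - d  # number of columns in the result
    # np.empty leaves the memory uninitialized; every entry is overwritten below.
    out = np.empty((d * n_rows, width), dtype=X.dtype)
    for i in range(d):
        # Row block i holds columns i .. i + width - 1 of X.
        out[i * n_rows:(i + 1) * n_rows, :] = X[:, i:i + width]
    return out


def hankel_sliding(X, d):
    """Sliding-window variant (hypothetical name)."""
    n_rows, n_cols = X.shape
    width = n_cols + 1 - d
    # The windows have shape (1, d, n_rows, width); reshape stacks them vertically.
    windows = sliding_window_view(X, (n_rows, width))
    return windows.reshape(d * n_rows, width)


def hankel_vstack(X, d):
    """np.vstack variant (hypothetical name)."""
    width = X.shape[1] + 1 - d
    return np.vstack([X[:, i:i + width] for i in range(d)])


# Example matrix from the question.
X = np.array([[1, 10, 100, 100.1, 1.1],
              [2, 20, 200, 200.1, 2.2],
              [3, 30, 300, 300.1, 3.3]])
d = 2

expected = hankel_loop_empty(X, d)
assert np.array_equal(expected, hankel_sliding(X, d))
assert np.array_equal(expected, hankel_vstack(X, d))
```

The sliding-window view itself makes no copy, but the final reshape does, so all three variants end up writing the full output once. That is probably why they perform about the same.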