import numpy as np

data = np.arange(10)
n = len(data)
np.asarray([np.sum((data[0:i]-np.mean(data[0:i]))**2) for i in range(1,n)])
Can this for loop be vectorized, maybe by expanding dimensions and then collapsing them?
I got the hint from somewhere that I can replace
np.mean(data[0:i])
with
np.cumsum(data[0:n-1])/(np.arange(n-1)+1)
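A quick way to check that hint, as a minimal sketch: compare the per-prefix means from the loop with the cumsum-based running mean. The names `loop_means` and `running_means` are made up for this check and do not appear elsewhere.

```python
import numpy as np

data = np.arange(10)
n = len(data)

# Per-prefix means computed one slice at a time, for prefixes of length 1..n-1
loop_means = np.asarray([np.mean(data[0:i]) for i in range(1, n)])

# The same means from a single cumulative sum divided by the prefix lengths
running_means = np.cumsum(data[0:n-1]) / (np.arange(n-1) + 1)

# Report whether the two agree element-wise
print(np.allclose(loop_means, running_means))
```

This only covers the mean; the squared deviations still depend on which prefix each element belongs to, which is why the sum itself needs more than a plain cumsum of `(data - mean)**2`.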
Answer
It can be vectorized by expanding dimensions as you suggested. I think the secret sauce is using np.tril to zero out terms in the progression before summing:
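# data and n are as defined in the question

# calculate means using cumsum
mean = np.cumsum(data) / np.arange(1, n+1)

# expand into 2 dimensions: row i of mean_2d is mean[i] repeated, row i of data_2d is data
mean_2d = np.repeat(mean, n).reshape(n, n)
data_2d = np.tile(data, n).reshape(n, n)

# zero out unneeded terms (row i keeps only columns 0..i)
diff_squared = np.tril((data_2d-mean_2d)**2)

# sum along rows; this gives n values (prefixes of length 1..n),
# so drop the last one to match the loop in the question
np.sum(diff_squared, axis=1)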
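Two variations on the same idea, offered as a sketch rather than the definitive way to do it. The first lets broadcasting build the 2-D difference instead of `np.repeat` and `np.tile`. The second avoids the n-by-n intermediate entirely by using the identity sum((x - m)^2) = sum(x^2) - (sum(x))^2 / k for a prefix of length k. The function names `prefix_ss_broadcast` and `prefix_ss_cumsum` are hypothetical, chosen only for this example.

```python
import numpy as np

def prefix_ss_broadcast(data):
    """Sum of squared deviations from the prefix mean, for prefixes of length 1..n."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    mean = np.cumsum(data) / np.arange(1, n + 1)
    # mean[:, None] has shape (n, 1) and data[None, :] has shape (1, n),
    # so entry [i, j] of the difference is data[j] - mean[i]
    diff_squared = np.tril((data[None, :] - mean[:, None]) ** 2)
    return diff_squared.sum(axis=1)

def prefix_ss_cumsum(data):
    """Same quantity in O(n) memory, via cumulative sums of x and x**2."""
    data = np.asarray(data, dtype=float)
    k = np.arange(1, len(data) + 1)
    # For each prefix: sum(x**2) - (sum(x))**2 / k
    return np.cumsum(data ** 2) - np.cumsum(data) ** 2 / k

data = np.arange(10)
n = len(data)
loop = np.asarray([np.sum((data[0:i] - np.mean(data[0:i])) ** 2) for i in range(1, n)])

# Both functions return n values (prefix lengths 1..n); the loop in the question
# stops at length n-1, so the last entry is dropped before comparing.
print(np.allclose(prefix_ss_broadcast(data)[:-1], loop))
print(np.allclose(prefix_ss_cumsum(data)[:-1], loop))
```

The cumsum version scales much better for long arrays, but subtracting two large, nearly equal numbers can lose precision when the values are large relative to their spread, so the triangular version may be the safer choice for small n.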