score calculation takes too long: avoid for loops – python

I am new to Python and would appreciate your help.

I have three matrices, in particular:

• Matrix M (class of the matrix: scipy.sparse.csc.csc_matrix), dimensions: N x C;
• Matrix G (class of the matrix: numpy.ndarray), dimensions: C x T;
• Matrix L (class of the matrix: numpy.ndarray), dimensions: T x N.

Where: N = 10000, C = 1000, T = 20.

I would like to calculate this score:

$\sum_{i=1}^{N}\left(\sum_{c=1}^{C} M_{i,c} \cdot \sum_{t=1}^{T} L_{t,i} \cdot G_{c,t}\right)$

I tried using two `for` loops, one over the index `i` and one over `c`, and I used a `dot` product to compute the innermost sum in the equation. But my implementation takes too long to produce the result.

This is what I implemented:

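```
score = 0.0
for i in range(N):
    for c in range(C):
        Mic = M[i, c]
        # inner sum over t: dot product of column i of L with row c of G
        # (np.outer of two scalars returns a 1x1 array, so score becomes 1x1)
        score += np.outer(Mic, np.dot(L[:, i], G[c, :]))
```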

Is there a way to avoid the two `for` loops?

Best

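Try this: `score = np.einsum("ic,ti,ct->", M, L, G)` (as far as I know `einsum` does not accept SciPy sparse matrices, so you will probably need to pass a dense copy such as `M.toarray()`).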
By the way, in your case `score = np.sum(np.diag(M @ G @ L))` is faster than `einsum` (in Python 3.5 and later, the `@` operator does what the `matmul` function does). `np.trace((L @ M) @ G)` is faster still because it uses memory efficiently: its intermediates are only T x C and T x T, while `M @ G @ L` builds a full N x N matrix just to read its diagonal. Maybe this is what @hpaulj meant in his comment. But `einsum` is easier to use for complex tensor products (to write the `einsum` call I translated your math expression directly, without thinking about optimization).
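In case it helps, here is why the matrix forms give the same number as your double sum. This is just the standard cyclic property of the trace applied to your expression, written with the shapes from the question:

$$
\sum_{i=1}^{N}\sum_{c=1}^{C} M_{i,c}\sum_{t=1}^{T} L_{t,i}\, G_{c,t}
= \sum_{i=1}^{N}\sum_{t=1}^{T} (MG)_{i,t}\, L_{t,i}
= \sum_{i=1}^{N} (MGL)_{i,i}
= \operatorname{tr}(MGL)
= \operatorname{tr}(LMG)
$$

Here $MGL$ is N x N, while $LMG$ is only T x T, which is where the memory saving comes from.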
Generally, explicit `for` loops over `numpy` arrays slow the computation down dramatically, so with `numpy` the rule of thumb is "vectorize your computations".
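If you want to convince yourself that the formulations agree, here is a minimal sketch on small random data. It assumes a dense `M` to keep the example NumPy-only; the sizes, the seed, and the function names `score_loops`, `score_einsum` and `score_trace` are made up for illustration and are not from the question.

```python
import numpy as np


def score_loops(M, G, L):
    """Reference: the double loop from the question, with a scalar product."""
    N, C = M.shape
    score = 0.0
    for i in range(N):
        for c in range(C):
            # inner sum over t: column i of L dotted with row c of G
            score += M[i, c] * np.dot(L[:, i], G[c, :])
    return score


def score_einsum(M, G, L):
    """Direct translation of the triple sum; needs dense arrays."""
    return np.einsum("ic,ti,ct->", M, L, G)


def score_trace(M, G, L):
    """Trace form: (L @ M) is T x C, times G (C x T) is a small T x T matrix."""
    return np.trace((L @ M) @ G)


# Small illustrative sizes (the question uses N=10000, C=1000, T=20).
rng = np.random.default_rng(0)
N, C, T = 50, 30, 4
M = rng.random((N, C))  # dense stand-in for the sparse matrix in the question
G = rng.random((C, T))
L = rng.random((T, N))

ref = score_loops(M, G, L)
print(np.isclose(score_einsum(M, G, L), ref))
print(np.isclose(score_trace(M, G, L), ref))
```

With a real SciPy sparse `M`, the `@` products should work on the sparse matrix directly, so the trace form avoids creating a dense copy of `M`.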