Outer product of large vectors exceeds memory

Question

I have three 1D vectors. Let's say T with 100k element array, f and df each with 200 element array: For each element array, I have to calculate a function such as the following: My first instinct was to use the NumPy outer to find the function with each combination of f and df However, in this case, I am

Accepted Answer

Using broadcasting:P1 = (f[:, np.newaxis] * T).sum(axis=-1)P2 = (df[:, np.newaxis] * T**2).sum(axis=-1)P = P1[:, np.newaxis] + P2Alternatively, using outer:P1 = (np.outer(f, T)).sum(axis=-1)P2 = (np.outer(df, T**2)).sum(axis=-1)P = P1[..., np.newaxis] + P2This produces an array of shape (f.size, df.size) == (200, 200).Generally speaking, if the final output array size is very large, one can either:Reduce the size of the datatypes. One way is to change the datatypes of the arrays used to calculate the final output via P1.astype(np.float32). Alternatively, some operations allow one to pass in a dtype=np.float32 as a parameter.Chunk the computation and work with smaller subsections of the result.Based on the most recent edit, compute an array a with shape (200, 200, 100000). Then, take its element-wise norm along the last axis to produce an array z with shape (200, 200).a = (    f[:, np.newaxis, np.newaxis] * T    + df[np.newaxis, :, np.newaxis] * T**2)# L1 norm along last axis.z = np.abs(a).sum(axis=-1)This produces an array of shape (f.size, df.size) == (200, 200).

Advertisement

Answer