I have three 1D vectors. Let’s say T with 100k element array, f and df each with 200 element array:
T = [T0, T1, ..., T100k] f = [f0, f1, ..., f200] df = [df0, df1, ..., df200]
For each element array, I have to calculate a function such as the following:
P = T*f + T**2 *df
My first instinct was to use the NumPy outer to find the function with each combination of f and df
P1 = np.outer(f,T) P2 = np.outer(df,T**2) P = np.add.outer(P1, P2)
However, in this case, I am facing the ram issue and receiving the following error:
Unable to allocate 2.23 PiB for an array with shape (200, 100000, 200, 100000) and data type float64
Is there a good way that I can calculate this?
My attempt using for loops
n=100 f_range = 5e-7 df_range = 1.5e-15 fsrc = np.arange(f - n * f_range, f + n * f_range, f_range) #array of 200 dfsrc = np.arange(df - n * df_range, df + n * df_range, df_range) #array of 200 dfnus=pd.DataFrame(fsrc) numf=dfnus.shape[0] dfnudots=pd.DataFrame(dfsrc) numfdot=dfnudots.shape[0] test2D = np.zeros([numf,(numfdot)]) for indexf, f in enumerate(fsrc): for indexfd, fd in enumerate(dfsrc): a=make_phase(T,f,fd) #--> this is just a function that performs T*f + T**2 *df zlauf2d=z_n(a, n=1, norm=1) #---> And this is just another function that takes this 1D "a" and gives another 1D element array test2D[indexf, indexfd]=np.copy(zlauf2d) #---> I do this so I could make a contour plot at the end. It just copys the same thing to 2D
Now my test2D has the shape of (200,200). This is what I want, however the floor loop is taking ages and I want somehow reduce two for loop to at least one.
Advertisement
Answer
Using broadcasting:
P1 = (f[:, np.newaxis] * T).sum(axis=-1) P2 = (df[:, np.newaxis] * T**2).sum(axis=-1) P = P1[:, np.newaxis] + P2
Alternatively, using outer:
P1 = (np.outer(f, T)).sum(axis=-1) P2 = (np.outer(df, T**2)).sum(axis=-1) P = P1[..., np.newaxis] + P2
This produces an array of shape (f.size, df.size) == (200, 200)
.
Generally speaking, if the final output array size is very large, one can either:
- Reduce the size of the datatypes. One way is to change the datatypes of the arrays used to calculate the final output via
P1.astype(np.float32)
. Alternatively, some operations allow one to pass in adtype=np.float32
as a parameter. - Chunk the computation and work with smaller subsections of the result.
Based on the most recent edit, compute an array a
with shape (200, 200, 100000)
. Then, take its element-wise norm along the last axis to produce an array z
with shape (200, 200)
.
a = ( f[:, np.newaxis, np.newaxis] * T + df[np.newaxis, :, np.newaxis] * T**2 ) # L1 norm along last axis. z = np.abs(a).sum(axis=-1)
This produces an array of shape (f.size, df.size) == (200, 200)
.