I have one 2-dimensional NumPy array and another 3-dimensional array. For each number in the first array, I would like to count how often this value or a more extreme one appears in the second array (taking the 3rd dimension as the comparison vector for each element in the first array). For 0 values the function should return np.nan, since it is not possible to decide whether 0s should be compared to negative or positive numbers.
EDIT: By ‘extreme’ I mean that positive values in a should only be compared with positive values in b, and negative values only with negative values in b.
Example:
import numpy as np

np.random.seed(42)

# this is the 2D array
a = np.random.randint(low=-5, high=5, size=(5, 5))

# for each value in a, count how often this value or an extremer one appears
# in b (taking the last dimension of b as comparison vectors)
b = np.random.randint(low=-5, high=5, size=(5, 5, 5))

# expected result
result = np.array([[2, 2, 1, 1, 2],
                   [0, 1, 1, 3, 3],
                   [1, 3, 2, 2, np.nan],
                   [2, 0, 0, np.nan, 1],
                   [3, 0, 1, np.nan, 3]])
Answer
For all these operations, you will want to transform a into a 3D array to utilize broadcasting:
a3 = a[..., None]
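As a quick illustration of what the added axis does, here is a small sketch. The zero-filled arrays are placeholders chosen only to show shapes, not the data from the question.

```python
import numpy as np

a = np.zeros((5, 5), dtype=int)       # placeholder 2D array
b = np.zeros((5, 5, 5), dtype=int)    # placeholder 3D array

a3 = a[..., None]                     # same as a[:, :, np.newaxis]
print(a3.shape)                       # (5, 5, 1)

# The trailing axis of length 1 is stretched to match b's last axis,
# so each a[i, j] is compared against the whole vector b[i, j, :].
print((b >= a3).shape)                # (5, 5, 5)

# a3 is a view, not a copy: in-place operations on a3 also change a.
print(np.shares_memory(a, a3))        # True
```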
You can use np.sign to normalize the direction of the extrema:
s = np.sign(a3)
# a3 is a view of a, so multiply out of place to leave a unchanged
a3 = a3 * s
b3 = b * s
Now all your extrema are positive, so you can count the number of times an element of b3 is greater than or equal to the corresponding element of a3:
result = (b3 >= a3).sum(axis=-1)
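To see why multiplying by the sign works, it may help to trace a tiny case by hand. The arrays below are made-up values for illustration, not the data from the question.

```python
import numpy as np

a = np.array([[2, -3]])                  # one positive, one negative value
b = np.array([[[1, 2, 3, -4],            # compared against a[0, 0] = 2
               [-3, -5, 4, -4]]])        # compared against a[0, 1] = -3

s = np.sign(a)[..., None]                # shape (1, 2, 1): [[[1], [-1]]]

# For the negative entry both sides are negated, so "b <= -3" becomes
# "-b >= 3" and a single >= comparison covers both directions.
counts = (b * s >= a[..., None] * s).sum(axis=-1)
print(counts)                            # [[2 3]]
```

Here 2 and 3 are at least as large as 2, while -3, -5 and -4 are at least as negative as -3.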
If you want to set zero elements to np.nan, you will first need a floating point array. The simplest way to get one is to specify the dtype in the previous line:
result = (b3 >= a3).sum(axis=-1, dtype=float)
result[a == 0] = np.nan
This can be written more concisely as:
s = np.sign(a)[..., None]
result = (b * s >= a[..., None] * s).sum(axis=-1, dtype=float)
result[a == 0] = np.nan
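One way to gain confidence in the vectorized version is to compare it against a plain loop that follows the question's wording literally. The sketch below wraps the concise form in a function and checks it on the arrays from the question. The names count_extreme and count_extreme_loop are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def count_extreme(a, b):
    """Vectorized count (hypothetical helper wrapping the concise form)."""
    s = np.sign(a)[..., None]
    result = (b * s >= a[..., None] * s).sum(axis=-1, dtype=float)
    result[a == 0] = np.nan
    return result

def count_extreme_loop(a, b):
    """Slow reference version (hypothetical), written element by element."""
    out = np.full(a.shape, np.nan)       # zeros in a stay NaN
    for idx in np.ndindex(a.shape):
        v = a[idx]
        if v > 0:
            out[idx] = np.count_nonzero(b[idx] >= v)
        elif v < 0:
            out[idx] = np.count_nonzero(b[idx] <= v)
    return out

np.random.seed(42)
a = np.random.randint(low=-5, high=5, size=(5, 5))
b = np.random.randint(low=-5, high=5, size=(5, 5, 5))

# assert_array_equal treats NaNs in matching positions as equal
np.testing.assert_array_equal(count_extreme(a, b), count_extreme_loop(a, b))
```

The loop is only meant as a cross-check; for large arrays the broadcast version avoids the Python-level iteration.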