Skip to content
Advertisement

Check if two numpy arrays are identical

Suppose I have a bunch of arrays, including x and y, and I want to check if they’re equal. Generally, I can just use np.all(x == y) (barring some dumb corner cases which I’m ignoring now).

However this evaluates the entire array of (x == y), which is usually not needed. My arrays are really large, and I have a lot of them, and the probability of two arrays being equal is small, so in all likelihood, I really only need to evaluate a very small portion of (x == y) before the all function could return False, so this is not an optimal solution for me.

I’ve tried using the builtin all function, in combination with itertools.izip: all(val1==val2 for val1,val2 in itertools.izip(x, y))

However, that just seems much slower in the case that two arrays are equal, that overall, it’s stil not worth using over np.all. I presume because of the builtin all‘s general-purposeness. And np.all doesn’t work on generators.

Is there a way to do what I want in a more speedy manner?

I know this question is similar to previously asked questions (e.g. Comparing two numpy arrays for equality, element-wise) but they specifically don’t cover the case of early termination.

Advertisement

Answer

Until this is implemented in numpy natively you can write your own function and jit-compile it with numba:

import numpy as np
import numba as nb


@nb.jit(nopython=True)
def arrays_equal(a, b):
    if a.shape != b.shape:
        return False
    for ai, bi in zip(a.flat, b.flat):
        if ai != bi:
            return False
    return True


a = np.random.rand(10, 20, 30)
b = np.random.rand(10, 20, 30)


%timeit np.all(a==b)  # 100000 loops, best of 3: 9.82 µs per loop
%timeit arrays_equal(a, a)  # 100000 loops, best of 3: 9.89 µs per loop
%timeit arrays_equal(a, b)  # 100000 loops, best of 3: 691 ns per loop

Worst case performance (arrays equal) is equivalent to np.all and in case of early stopping the compiled function has the potential to outperform np.all a lot.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement