Indexing with boolean arrays into multidimensional arrays using numpy

I am new to NumPy, and one thing that I really don’t understand is array indexing.

In the tentative tutorial there is this example:

>>> from numpy import arange, array
>>> a = arange(12).reshape(3,4)
>>> b1 = array([False,True,True])             # first dim selection
>>> b2 = array([True,False,True,False])       # second dim selection
>>>
>>> a[b1,b2]                                  # a weird thing to do
array([ 4, 10])

I have no idea why it does that last thing. Can anyone explain that to me?

Thanks!

Answer

Your array consists of:

0  1  2  3
4  5  6  7
8  9 10 11

One way of indexing it would be using a list of integers, specifying which rows/columns to include:

>>> i1 = [1,2]
>>> i2 = [0,2]
>>> a[i1,i2]
array([ 4, 10])

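Meaning: the two lists are paired element by element, so you get the element at row 1, column 0 (which is 4) and the element at row 2, column 2 (which is 10).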
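To make the pairing explicit, here is a small sketch that builds the same result by hand with zip. The names paired and by_hand are only for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
i1 = [1, 2]  # row indices
i2 = [0, 2]  # column indices

# Integer index lists are paired element by element, like zip(i1, i2).
paired = a[i1, i2]

# The same selection spelled out one (row, column) pair at a time.
by_hand = np.array([a[r, c] for r, c in zip(i1, i2)])

print(paired)   # [ 4 10]
print(by_hand)  # [ 4 10]
```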

When you use boolean indices, you specify for each row/column whether it is included (True) or left out (False):

>>> b1 = [False,True,True]       # 0:no,  1:yes, 2:yes       ==> [1,2]
>>> b2 = [True,False,True,False] # 0:yes, 1:no,  2:yes, 3:no ==> [0,2]

As you can see, this is equivalent to the i1 and i2 shown above. Hence, a[b1,b2] will have the same result.
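If you want to check this equivalence yourself, np.nonzero gives the positions of the True values in a mask. A minimal sketch, where rows, cols, from_ints and from_masks are just illustrative names:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

# Positions of the True values in each mask.
rows = np.nonzero(b1)[0]  # [1 2]
cols = np.nonzero(b2)[0]  # [0 2]

from_ints = a[rows, cols]  # integer indexing
from_masks = a[b1, b2]     # boolean indexing

print(from_ints)   # [ 4 10]
print(from_masks)  # [ 4 10]
```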

Note also that the operation above works because b1 and b2 have the same number of True values, so they become two integer index arrays of the same length. If the resulting index arrays have different lengths and cannot be broadcast together, NumPy raises an IndexError.
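A short sketch of what happens when the counts differ, using a hypothetical third mask b3 with three True values. It also shows np.ix_, which is one way to get the full block of selected rows and columns instead of element-by-element pairs, in case that is what you were actually after.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])
b3 = np.array([True, True, True])  # hypothetical mask: three True values

# Three row indices cannot be paired with two column indices.
try:
    a[b3, b2]
    mismatch_error = None
except IndexError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)  # IndexError

# np.ix_ combines the masks as rows x columns instead of pairing them.
block = a[np.ix_(b1, b2)]
print(block)
# [[ 4  6]
#  [ 8 10]]
```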

User contributions licensed under: CC BY-SA