I have an np.ndarray
of shape (5, 5, 2, 2, 2, 10, 8)
named table
. I can succesfully slice it like this:
table[4, [0, 1], 1, 1, 1, slice(0, 10, None), slice(0, 8, None)]
table[4, [0, 1], 1, 1, 1, [0, 2], slice(0, 8, None)]
But for some reason when I try to specify three values for dimension 5 (of length 10) like this:
table[4, [0, 1], 1, 1, 1, [0, 2, 6], slice(0, 8, None)]
I get:
>>> IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
The same is for:
table[4, [0, 1, 4], 1, 1, 1, [0, 2], slice(0, 8, None)]
This does not happen with:
table[4, [0, 1, 4], 1, 1, 1, slice(0, 10, None), slice(0, 8, None)]
table[4, [1, 0, 4], 1, 1, 1, slice(0, 10, None), slice(0, 8, None)]
which output the correct result.
I tried to read similar questions here on broadcasting but I was still confused why Numpy
can’t make sense of this slice notation. Why does it act all puzzled when I give it more than two points along an axis to slice with when there’s already another array in the indices?
Advertisement
Answer
In [219]: table = np.zeros((5, 5, 2, 2, 2, 10, 8),int) In [220]: table.shape Out[220]: (5, 5, 2, 2, 2, 10, 8)
The fact that you use slice
instead of :
doesn’t matter; same for the fact that the trailing slices don’t have to be specified.
In [221]: table[4, [0, 1], 1, 1, 1, slice(0, 10, None), slice(0, 8, None)].shape Out[221]: (2, 10, 8)
This has an advanced indexing array/list of length 2 – the other dimensions are either scalars or slices. So they disappear or ‘pass through’.
In [222]: table[4, [0, 1], 1, 1, 1, [0, 2], slice(0, 8, None)].shape Out[222]: (2, 8)
Here you have two advanced indexing lists – both length 2, so they ‘broadcast’ together to select 2 values (I think of this as a kind of ‘diagonal’).
In [223]: table[4, [0, 1, 4], 1, 1, 1, slice(0, 10, None), slice(0, 8, None)].shape Out[223]: (3, 10, 8)
Same as before but with a length 3 list.
But when the 2 lists have different length you get an error:
In [225]: table[4, [0, 1], 1, 1, 1, [0, 2, 6], slice(0, 8, None)].shape --------------------------------------------------------------------------- IndexError Traceback (most recent call last) Input In [225], in <cell line: 1>() ----> 1 table[4, [0, 1], 1, 1, 1, [0, 2, 6], slice(0, 8, None)].shape IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
If one list is (2,1), then it works – it selects 2 in one dimension, and 3 in the other:
In [226]: table[4, [[0], [1]], 1, 1, 1, [0, 2, 6], slice(0, 8, None)].shape Out[226]: (2, 3, 8)
In indexing, ‘broadcasting’ follows the same rules as when adding (or multiplying) arrays.
(2,) with (2,) => (2,) (2,1) with (3,) => (2,3) (2,) with (3,) error In [227]: np.ones(2)+np.ones(2) Out[227]: array([2., 2.]) In [228]: np.ones(2)+np.ones(3) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) Input In [228], in <cell line: 1>() ----> 1 np.ones(2)+np.ones(3) ValueError: operands could not be broadcast together with shapes (2,) (3,) In [229]: np.ones((2,1))+np.ones(3) Out[229]: array([[2., 2., 2.], [2., 2., 2.]]
edit
Look at a simpler 2d array:
In [261]: arr = np.arange(6).reshape(2,3) In [262]: arr Out[262]: array([[0, 1, 2], [3, 4, 5]])
If I index with 2 (2,) arrays I get 2 values:
In [264]: arr[np.array([0,1]), np.array([1,2])] Out[264]: array([1, 5])
But if I index with a (2,1) and (2,), I get a (2,2) shape result. Note where the [1,5] values are:
In [265]: arr[np.array([0,1])[:,None], np.array([1,2])] Out[265]: array([[1, 2], [4, 5]])
ix_
is a handy tool for constructing such a “cartesian” set of indexing arrays. For example 3 lists I get:
In [266]: np.ix_([1,2],[3,4,5],[6,7]) Out[266]: (array([[[1]], [[2]]]), array([[[3], [4], [5]]]), array([[[6, 7]]])) In [267]: [i.shape for i in np.ix_([1,2],[3,4,5],[6,7])] Out[267]: [(2, 1, 1), (1, 3, 1), (1, 1, 2)]
Together those will select a block of shape (2,3,2) from a 3d (or larger) array.
Formally this is described in https://numpy.org/doc/stable/user/basics.indexing.html#advanced-indexing
(Your slices are all at the end. There is a nuance to this indexing when slices occur in the middle. See the subsection about Combining advanced and basic indexing
if that arises.)