Python object array of empty arrays

I am trying to create a NumPy array of empty arrays without using loops. With a loop, I can do something like

import numpy as np
a = np.empty((3, 3), object)
for i in range(a.size):
    a.ravel()[i] = np.array([])

Or a marginally more sophisticated approach based on np.nditer:

a = np.empty((3, 3), object)
it = np.nditer(a, flags=['multi_index', 'refs_ok'])
for i in it:
    a[it.multi_index] = np.array([])

I can’t seem to find an indexing expression that will allow me to make this assignment in a vectorized manner.

I’ve tried the following:

>>> a[:] = np.zeros((0, 3, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (0,3,3) into shape (3,3)
>>> a[..., None] = np.zeros((3, 3, 0))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (3,3,0) into shape (3,3,1)
>>> a = np.full((3, 3), np.array([]), object)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/site-packages/numpy/core/numeric.py", line 343, in full
    multiarray.copyto(a, fill_value, casting='unsafe')
  File "<__array_function__ internals>", line 5, in copyto
ValueError: could not broadcast input array from shape (0,) into shape (3,3)

Even np.nditer does not allow me to write back:

>>> a = np.empty((3, 3), object)
>>> for x in np.nditer(a, flags=['refs_ok'], op_flags=['readwrite']):
...    x[...] = np.array([])
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
ValueError: could not broadcast input array from shape (0,) into shape ()

Is there any way to make an object array of empty arrays in a vectorized manner?

The exact output I’m looking for is

array([[array([], dtype=float64), array([], dtype=float64), array([], dtype=float64)],
       [array([], dtype=float64), array([], dtype=float64), array([], dtype=float64)],
       [array([], dtype=float64), array([], dtype=float64), array([], dtype=float64)]], dtype=object)

For the purposes of this question, I don’t care if the references are all the same array or different arrays.

Answer

You can make NumPy’s broadcasting logic treat an array as a scalar object, rather than as a source to broadcast from, by wrapping it in a 0-dimensional object array:

import numpy
template = numpy.empty((), dtype=object)
element = numpy.array([], dtype=float)
template[()] = element

Then you can perform a broadcasted assignment to store the element array in every cell of a result array:

result = numpy.empty((3, 3), dtype=object)
result[:] = template

Broadcasting then operates on template, which has shape (), rather than on element, which has shape (0,). The result is a 3-by-3 array of object dtype in which every cell holds a reference to the same single element array.
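As a quick check of the behaviour described above, here is a minimal sketch that repeats the assignment and then tests whether each cell is the very same object as element. The name all_same is only an illustration and is not part of the original answer.

```python
import numpy as np

# Wrap the empty array in a 0-d object array so broadcasting sees a scalar.
element = np.array([], dtype=float)
template = np.empty((), dtype=object)
template[()] = element

# Broadcast the 0-d template into every cell of the result.
result = np.empty((3, 3), dtype=object)
result[:] = template

# Check identity, not equality: does each cell refer to `element` itself?
all_same = all(cell is element for cell in result.flat)
print(result.shape, result.dtype, all_same)
```

Because the cells share one array object, this fits the question's stated requirement that identical references are acceptable. If independent arrays were needed, a loop such as the ones in the question would still be required.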

result = numpy.full((3, 3), template) would also work, but it seems even more confusing than the slice assignment. The slice assignment is already fairly confusing: it is not immediately obvious whether the cells of result end up holding references to template or to element.
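For completeness, the numpy.full variant can be sketched as follows, assuming the same template construction as above. Inspecting one cell is a simple way to see which object it references. The names filled and cell are illustrative only.

```python
import numpy as np

element = np.array([], dtype=float)
template = np.empty((), dtype=object)
template[()] = element

# With no explicit dtype, np.full takes the dtype from the fill value,
# which here is the 0-d object array.
filled = np.full((3, 3), template)

# Look at a single cell to see what it actually holds.
cell = filled[0, 0]
print(type(cell), cell.shape, cell is element)
```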

User contributions licensed under: CC BY-SA