Question about turning a list of numpy arrays into an object array

I have a question about turning a list of numpy arrays into an object array.

import numpy as np

testing_1=[np.array([1]),np.array([2]),np.array([3]),np.array([4]),np.array([np.nan])]
testing_1_array=np.asarray(testing_1, dtype=object)

testing_2=[np.array([1]),np.array([2,3]),np.array([4]),np.array([np.nan])]
testing_2_array=np.asarray(testing_2, dtype=object)

This results in two very different outcomes:

testing_1_array
Out[12]: 
array([[1],
       [2],
       [3],
       [4],
       [nan]], dtype=object)

testing_2_array
Out[13]: array([array([1]), array([2, 3]), array([4]), array([nan])], dtype=object)

I assume the difference arises because the arrays in testing_2 do not all have the same size. Is there a way to force numpy to build testing_1_array the same way as testing_2_array, so that I do not have to check whether all arrays in the initial list have the same size?

Answer

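np.array (and np.asarray, which the question uses) tries, where possible, to build a multidimensional array from its input. A ragged object dtype array, whose elements are themselves arrays, is only the fallback when the inputs cannot be combined that way, and with some combinations of shapes even the fallback raises an error. Specifying dtype=object doesn’t change that fundamental behavior: same-sized inputs are still stacked into a 2-D array, just with object elements.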
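As a small illustration of the behavior described above, the shapes of the two results show the difference directly. The variable names `same`, `ragged`, `a`, and `b` are made up for this sketch and are not from the question.

```python
import numpy as np

# Hypothetical inputs: one list of equal-length arrays, one ragged list.
same = [np.array([1]), np.array([2]), np.array([3])]
ragged = [np.array([1]), np.array([2, 3]), np.array([4])]

a = np.asarray(same, dtype=object)
b = np.asarray(ragged, dtype=object)

print(a.shape)  # (3, 1): the inputs were stacked into a 2-D object array
print(b.shape)  # (3,): each element is still one of the original arrays
```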

Creating an “empty” object array of the right length and then filling it is the most general option.

In [272]: arr = np.empty(5,object)      # filled with None
In [273]: arr[:] = [np.array([1]),np.array([2]),np.array([3]),np.array([4]),np.array([np.nan])]
In [274]: arr
Out[274]: 
array([array([1]), array([2]), array([3]), array([4]), array([nan])],
      dtype=object)

It also works with the ragged shape:

In [276]: arr = np.empty(4,object)
In [277]: arr[:] = [np.array([1]),np.array([2,3]),np.array([4]),np.array([np.nan])]
In [278]: arr
Out[278]: array([array([1]), array([2, 3]), array([4]), array([nan])], dtype=object)
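To avoid checking the sizes by hand, the create-and-fill approach can be wrapped in a small helper. This is only a sketch: the name `to_object_array` is hypothetical, and it fills the slots one at a time in a loop rather than with the slice assignment shown above, as a conservative choice that does not depend on how a whole list is interpreted on assignment.

```python
import numpy as np

def to_object_array(arrays):
    """Return a 1-D object array whose elements are the given arrays.

    Hypothetical helper, not part of numpy.
    """
    out = np.empty(len(arrays), dtype=object)  # one slot per input, initially None
    for i, a in enumerate(arrays):
        out[i] = a  # store the array itself in slot i
    return out

same = to_object_array([np.array([1]), np.array([2]), np.array([np.nan])])
ragged = to_object_array([np.array([1]), np.array([2, 3]), np.array([4])])

print(same.shape, ragged.shape)  # (3,) (3,)
```

Both results are 1-D regardless of whether the input arrays happen to share a size, so downstream code can treat the two cases uniformly.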
User contributions licensed under: CC BY-SA