I have a question about turning a list of numpy arrays into an object array.
import numpy as np
testing_1 = [np.array([1]), np.array([2]), np.array([3]), np.array([4]), np.array([np.nan])]
testing_1_array = np.asarray(testing_1, dtype=object)
testing_2 = [np.array([1]), np.array([2, 3]), np.array([4]), np.array([np.nan])]
testing_2_array = np.asarray(testing_2, dtype=object)
This results in two very different outcomes:
testing_1_array
Out[12]: array([[1], [2], [3], [4], [nan]], dtype=object)
testing_2_array
Out[13]: array([array([1]), array([2, 3]), array([4]), array([nan])], dtype=object)
I assume the difference arises because the arrays in testing_2 do not all have the same size. Is there any way to force numpy to build testing_1_array in the same form as testing_2_array (a 1-D object array whose elements are arrays), so that I do not have to additionally check whether all arrays in the initial list have the same size?
Answer
np.array tries, where possible, to make a multidimensional numeric-dtype array. Creating a ragged object-dtype array is a fallback option, and with some combinations of shapes even that raises an error. Specifying object dtype doesn't change that fundamental behavior.
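As a small illustration of that behavior, the sketch below compares the resulting shapes for an equal-size list and a ragged list. The names same_size and ragged are made up for this example and are not from the question.

```python
import numpy as np

# Hypothetical inputs: every array has length 1 in the first list,
# while the second list mixes lengths 1 and 2.
same_size = [np.array([1]), np.array([2]), np.array([3])]
ragged = [np.array([1]), np.array([2, 3]), np.array([4])]

a = np.array(same_size, dtype=object)
b = np.array(ragged, dtype=object)

# Equal sizes: the inputs are stacked into a 2-D object array.
print(a.shape, a.dtype)  # (3, 1) object

# Ragged sizes: fallback to a 1-D object array whose elements are arrays.
print(b.shape, b.dtype)  # (3,) object
```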
Creating an “empty” array and filling it is the most general option.
In [272]: arr = np.empty(5,object)   # filled with None
In [273]: arr[:] = [np.array([1]),np.array([2]),np.array([3]),np.array([4]),np.array([np.nan])]
In [274]: arr
Out[274]: array([array([1]), array([2]), array([3]), array([4]), array([nan])], dtype=object)
It also works with the ragged shape:
In [276]: arr = np.empty(4,object)
In [277]: arr[:] = [np.array([1]),np.array([2,3]),np.array([4]),np.array([np.nan])]
In [278]: arr
Out[278]: array([array([1]), array([2, 3]), array([4]), array([nan])], dtype=object)
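If this comes up often, the create-and-fill pattern could be wrapped in a small helper. The following is only a sketch: the name to_object_array is hypothetical, and it fills the array one element at a time rather than with slice assignment, which should give the same result for the lists in the question.

```python
import numpy as np

def to_object_array(items):
    """Return a 1-D object array holding each item of `items` as one element.

    `to_object_array` is a hypothetical helper name, not a NumPy function.
    """
    items = list(items)
    arr = np.empty(len(items), dtype=object)  # initially filled with None
    for i, item in enumerate(items):
        arr[i] = item  # store the array itself, no stacking
    return arr

testing_1 = [np.array([1]), np.array([2]), np.array([3]), np.array([4]), np.array([np.nan])]
testing_2 = [np.array([1]), np.array([2, 3]), np.array([4]), np.array([np.nan])]

out_1 = to_object_array(testing_1)
out_2 = to_object_array(testing_2)

# Both results are 1-D, regardless of whether the inputs share a size.
print(out_1.shape, out_2.shape)  # (5,) (4,)
```

With this, no separate check for equal sizes is needed: the output shape depends only on the length of the input list.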