I have a pandas DataFrame with a structure like this:
    In [22]: df
    Out[22]:
               a             b
    0  [1, 2, 3]     [4, 5, 6]
    1  [7, 8, 9]  [10, 11, 12]
To build it, do something like this:

    import pandas as pd

    # Start with placeholder objects so each column gets object dtype,
    # then assign a list to each cell.
    df = pd.DataFrame([[object(), object()], [object(), object()]], columns=["a", "b"])
    df.iat[0, 0] = [1, 2, 3]
    df.iat[0, 1] = [4, 5, 6]
    df.iat[1, 0] = [7, 8, 9]
    df.iat[1, 1] = [10, 11, 12]
What would be the simplest way to turn it into a NumPy 3-dimensional array? This would be the expected result:
    In [20]: arr
    Out[20]:
    array([[[ 1,  2,  3],
            [ 4,  5,  6]],

           [[ 7,  8,  9],
            [10, 11, 12]]])

    In [21]: arr.shape
    Out[21]: (2, 2, 3)

    In [22]: df.iloc[0, 0]
    Out[22]: [1, 2, 3]

    In [23]: arr[0, 0]
    Out[23]: array([1, 2, 3])

    In [24]: df.iloc[-1]
    Out[24]:
    a       [7, 8, 9]
    b    [10, 11, 12]
    Name: 1, dtype: object

    In [25]: arr[-1]
    Out[25]:
    array([[ 7,  8,  9],
           [10, 11, 12]])
I have tried several things, without success:
    In [6]: df.values  # Notice the dtype
    Out[6]:
    array([[list([1, 2, 3]), list([4, 5, 6])],
           [list([7, 8, 9]), list([10, 11, 12])]], dtype=object)

    In [7]: df.values.astype(int)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    TypeError: int() argument must be a string, a bytes-like object or a real number, not 'list'

    The above exception was the direct cause of the following exception:

    ValueError                                Traceback (most recent call last)
    Input In [7], in <cell line: 1>()
    ----> 1 df.values.astype(int)

    ValueError: setting an array element with a sequence.

    In [14]: df.values.reshape(2, 2, -1)
    Out[14]:
    array([[[list([1, 2, 3])],
            [list([4, 5, 6])]],

           [[list([7, 8, 9])],
            [list([10, 11, 12])]]], dtype=object)
Answer
One option is to convert df to a nested list, then build a NumPy array from that list:
    out = np.array(df.to_numpy().tolist())
Output:
    >>> out
    array([[[ 1,  2,  3],
            [ 4,  5,  6]],

           [[ 7,  8,  9],
            [10, 11, 12]]])

    >>> out.shape
    (2, 2, 3)

    >>> out[0,0]
    array([1, 2, 3])

    >>> out[-1]
    array([[ 7,  8,  9],
           [10, 11, 12]])
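For reference, here is a self-contained sketch of the same conversion, runnable end to end. The helper name `frame_to_3d` is hypothetical (it is not part of pandas or NumPy), and the length check is an added assumption: the approach only produces a regular 3-D array when every cell holds a list of the same length, so the sketch raises early if that is not the case.

```python
import numpy as np
import pandas as pd


def frame_to_3d(df: pd.DataFrame) -> np.ndarray:
    """Stack a DataFrame of equal-length lists into a (rows, cols, length) array.

    Hypothetical helper for illustration; not a pandas or NumPy function.
    """
    # 2-D object array -> nested Python lists: rows -> cells -> list items
    nested = df.to_numpy().tolist()
    # Assumption: all cells must have the same length to form a regular array
    lengths = {len(cell) for row in nested for cell in row}
    if len(lengths) > 1:
        raise ValueError(f"cells have differing lengths: {sorted(lengths)}")
    return np.array(nested)


df = pd.DataFrame({"a": [[1, 2, 3], [7, 8, 9]], "b": [[4, 5, 6], [10, 11, 12]]})
arr = frame_to_3d(df)
print(arr.shape)  # (2, 2, 3)
```

The `tolist()` step is what makes this work: it replaces the object-dtype array of list objects with plain nested lists, which `np.array` can then interpret as three nested dimensions.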