# Convert x same size 2D numpy arrays to a 2+x column data frame

I have two ndarrays of size (m x n), and two lists of length m and n respectively. I want to convert the two matrices to a dataframe with four columns. The first two columns correspond to the m and n dimensions, and contain the values from the lists. The next two columns should contain the values from the two matrices. In total, the resulting dataframe should have m times n rows.

Example: If these are the two matrices and two lists,

```a1 = np.array([[1, 2], [3, 4],[5,6]])
a2 = np.array([[10, 20], [30, 40],[50,60]])
l1 = [5,7,99]
l2 = [2,3]
```

then the resulting dataframe should look like this:

```"l1" "l2" "a1" "a2"
5    2    1    10
7    2    3    30
99   2    5    50
5    3    2    20
7    3    4    40
99   3    6    60
```

The order of the rows does not matter.

Although I only have two matrices in this specific case, I am curious about a solution which is easily applicable to any number of same size matrices.

Use `np.vstack` for join arrays created by `numpy.tile`, `numpy.repeat` and `numpy.ravel` and pass to `DataFrame` cosntructor:

```a = np.vstack((np.tile(l1, len(l2)),
np.repeat(l2, len(l1)),
np.ravel(a1, 'F'),
np.ravel(a2, 'F'))).T
print (a)
[[ 5  2  1 10]
[ 7  2  3 30]
[99  2  5 50]
[ 5  3  2 20]
[ 7  3  4 40]
[99  3  6 60]]

df = pd.DataFrame(a, columns=['l1','l2','a1','a2'])
print (df)
l1  l2  a1  a2
0   5   2   1  10
1   7   2   3  30
2  99   2   5  50
3   5   3   2  20
4   7   3   4  40
5  99   3   6  60
```

For multiple arrays:

```arrays =  [a1, a2]

arr = [np.ravel(a, 'F') for a in arrays]
a = np.vstack((np.tile(l1, len(l2)),
np.repeat(l2, len(l1)),
arr)).T
print (a)
[[ 5  2  1 10]
[ 7  2  3 30]
[99  2  5 50]
[ 5  3  2 20]
[ 7  3  4 40]
[99  3  6 60]]

df = pd.DataFrame(a, columns=['l1','l2'] + [f'a{x+1}' for x in range(len(arrays))])
print (df)
l1  l2  a1  a2
0   5   2   1  10
1   7   2   3  30
2  99   2   5  50
3   5   3   2  20
4   7   3   4  40
5  99   3   6  60
```

Pandas only solution with `concat` and `DataFrame.unstack`:

```df = (pd.concat([pd.DataFrame(a1, columns=l2, index=l1).unstack(),
pd.DataFrame(a2, columns=l2, index=l1).unstack()],
axis=1, keys=['a1','a2'])
.rename_axis(['l2','l1']).swaplevel(1,0).reset_index())
print (df)
l1  l2  a1  a2
0   5   2   1  10
1   7   2   3  30
2  99   2   5  50
3   5   3   2  20
4   7   3   4  40
5  99   3   6  60
```

For multiple arrays:

```arrays =  [a1, a2]
df = (pd.concat([pd.DataFrame(a, columns=l2, index=l1).unstack() for a in arrays],
axis=1)
.rename_axis(['l2','l1'])
.swaplevel(1,0)
.rename(columns=lambda x: f'a{x+1}')
.reset_index())
print (df)
l1  l2  a1  a2
0   5   2   1  10
1   7   2   3  30
2  99   2   5  50
3   5   3   2  20
4   7   3   4  40
5  99   3   6  60
```
6 People found this is helpful