I’m creating a program that makes use of a tiled RGB image of shape (n, n, 3, l, l), where n is the number of tiles along each side of the image and l is the side length of each tile in pixels. I am trying to reshape this into shape (3, l * n, l * n). An example would be shape (7, 7, 3, 32, 32) to (3, 224, 224).
I want to keep the positions of the pixels in the new matrix, so I can visualise this image later. If I start with an image of a checkerboard pattern (every other tile has all pixel values set to 1, see example below), and use .reshape((3, 224, 224)) the result is the following:
Checkerboard (wanted result)
Wrong way of reshaping
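To make the mismatch concrete, here is a minimal sketch using NumPy and small hypothetical sizes (n = 2, l = 2, chosen only for illustration). It compares a plain reshape against explicitly stitching the tiles. A plain reshape keeps the memory order, so the two tile axes are consumed first and the pixels land in the wrong places.

```python
import numpy as np

n, l = 2, 2  # small hypothetical sizes, for illustration only
tiles = np.zeros((n, n, 3, l, l))
tiles[0, 1] = 1  # top-right tile set to ones
tiles[1, 0] = 1  # bottom-left tile set to ones

# Plain reshape: reads elements in memory order (tile row, tile col, channel, y, x)
naive = tiles.reshape(3, n * l, n * l)

# Reference: place each tile at its intended position
expected = np.zeros((3, n * l, n * l))
for i in range(n):
    for j in range(n):
        expected[:, i * l:(i + 1) * l, j * l:(j + 1) * l] = tiles[i, j]

print(np.array_equal(naive, expected))  # False: pixel positions are not preserved
```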
I have made this for loop method of merging the tiles, which works, but is quite slow:
l = 32  # the pixel length of each tile
img_reshaped = torch.zeros((3, 224, 224))
for i in range(len(noise)):
    for j in range(len(noise[i])):
        # Copy tile (i, j) into its block of the full image
        img_reshaped[:, i * l:(i + 1) * l, j * l:(j + 1) * l] = noise[i, j]
I’ve also tried using .fold(), but this only works with 3D matrices, and not 5D.
Any tips on how to solve this? I feel it should be relatively simple, but just can’t wrap my head around it just now.
PS: The code I used to generate the checkerboard:
noise = torch.zeros((7, 7, 3, 32, 32))
for i in range(len(noise)):
    for j in range(len(noise[i])):
        if (i % 2 == 0 and j % 2 != 0) or (i % 2 != 0 and j % 2 == 0):
            noise[i][j] = torch.ones((3, 32, 32))
Answer
I think you need to transpose before reshape:
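import numpy as np

n, l = 2, 3
arr = np.zeros((n, n, 3, l, l))
for i in range(n):
    for j in range(n):
        arr[i, j] = (i + j) % 2  # checkerboard: alternate tiles of 0s and 1s

# Reorder axes to (channel, tile row, y, tile col, x), then merge the pairs
out = arr.transpose(2, 0, 3, 1, 4).reshape(3, n * l, -1)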
Output:
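As a sanity check, here is a sketch that applies the same transpose-then-reshape idea to the sizes from the question (n = 7, l = 32) and compares it with the loop-based merge. The helper name `merge_tiles` is made up for this example. It uses NumPy; in PyTorch the equivalent should be `noise.permute(2, 0, 3, 1, 4).reshape(3, n * l, n * l)`.

```python
import numpy as np

def merge_tiles(tiles):
    """Merge tiles of shape (n, n, c, l, l) into an image of shape (c, n*l, n*l).

    Hypothetical helper name, used here only for illustration.
    """
    n, _, c, l, _ = tiles.shape
    # New axis order: (channel, tile row, y, tile col, x).
    # Reshape then fuses (tile row, y) into rows and (tile col, x) into columns.
    return tiles.transpose(2, 0, 3, 1, 4).reshape(c, n * l, n * l)

n, l = 7, 32
tiles = np.zeros((n, n, 3, l, l))
for i in range(n):
    for j in range(n):
        tiles[i, j] = (i + j) % 2  # checkerboard pattern

merged = merge_tiles(tiles)

# Reference result from the slower loop-based approach
expected = np.zeros((3, n * l, n * l))
for i in range(n):
    for j in range(n):
        expected[:, i * l:(i + 1) * l, j * l:(j + 1) * l] = tiles[i, j]

print(merged.shape)                      # (3, 224, 224)
print(np.array_equal(merged, expected))  # True
```

The transpose is what matters: it puts each tile index next to its matching pixel axis, so the reshape fuses them into full-image rows and columns instead of mixing tiles and channels.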