I have a 3D numpy array. It can be thought of as an image (to be exact, it holds the values of field points). I want to remove the border (0 values; note that negative values are possible) in all dimensions. The restriction is that the dimensions must remain the same for all molecules, i.e. I only want to trim the border as far as the "largest" entry in each dimension still fits inside the cropped array. So the whole data set (it is small, so its size is not an issue) needs to be taken into account.
Example in 2D:
0 0 0 0 0
0 1 0 0 0
0 1 1 0 0
0 0 0 0 0
0 0 0 0 0

0 0 0 0 0
0 0 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 1 0
Here the top row and the leftmost and rightmost columns should be removed: over the whole data set, they contain only 0 values.
The result would be below:
1 0 0
1 1 0
0 0 0
0 0 0

0 0 0
0 1 0
0 0 1
0 0 1
Since I'm not a numpy expert, I'm having trouble defining an algorithm for this. I need to find, in each dimension, the min and max index at which a non-zero value occurs, and then use those indices to trim the array.
Similar to this but in 3D and the cropping must take into account the whole data set.
How can I achieve this?
UPDATE 13th Feb 2019:
I have tried three answers here: Martin's, norok2's, and one using zip that seems to have been removed. The output dimensions are the same, so I assume all of them work.
I chose Martin's solution because I can easily extract the bounding box and apply it to the test set.
UPDATE Feb 25th:
If anyone is still following this, I would like further input. As mentioned, these aren't actually images but "field values", i.e. floats rather than greyscale images (uint8). That means I need to use at least float16, and this simply needs too much memory. (I have 48 GB available, but that is not enough even for 50% of the training set.)
Answer
Try this. It is the core algorithm. I don't understand exactly which sides you want to extract in your examples, but the algorithm below should be very easy to modify according to your needs.
Note: This algorithm extracts the CUBE from which all zero-valued borders are "deleted", so each side of the cube contains at least one value != 0.
```python
import numpy as np

# testing dataset
d = np.zeros(shape=[5, 5, 5])

# fill some values
d[3, 2, 1] = 1
d[3, 3, 1] = 1
d[1, 3, 1] = 1
d[1, 3, 4] = 1

# find indexes in all axes
xs, ys, zs = np.where(d != 0)
# for a 4D object (`as` is a reserved word in Python, so use another name)
# xs, ys, zs, ws = np.where(d != 0)

# extract cube with extreme limits of where the values are != 0
result = d[min(xs):max(xs)+1, min(ys):max(ys)+1, min(zs):max(zs)+1]
# for a 4D object
# result = d[min(xs):max(xs)+1, min(ys):max(ys)+1, min(zs):max(zs)+1, min(ws):max(ws)+1]

print(result.shape)  # (3, 2, 4)
```
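The same idea can be written without naming one index array per axis, which avoids editing the code when the number of dimensions changes. The following is only a sketch: `bounding_slices` is a hypothetical helper name, and it assumes the array contains at least one non-zero value.

```python
import numpy as np

def bounding_slices(arr):
    """Return one slice per axis spanning all non-zero entries of arr.

    Hypothetical helper; assumes arr has at least one non-zero value.
    """
    # np.nonzero returns one index array per axis, for any number of axes.
    index_arrays = np.nonzero(arr)
    return tuple(
        slice(int(idx.min()), int(idx.max()) + 1) for idx in index_arrays
    )

# same test data as in the example above
d = np.zeros((5, 5, 5))
d[3, 2, 1] = 1
d[3, 3, 1] = 1
d[1, 3, 1] = 1
d[1, 3, 4] = 1

box = bounding_slices(d)
result = d[box]
print(result.shape)  # (3, 2, 4)
```

Because `box` is a plain tuple of slices, it can be stored and applied later to any other array of the same shape, for example a test sample.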
Case 1:
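```python
d = np.zeros(shape=[5, 5, 5])
d[3, 2, 1] = 1  # ... just one value

# recompute the indexes and the cube as above
xs, ys, zs = np.where(d != 0)
result = d[min(xs):max(xs)+1, min(ys):max(ys)+1, min(zs):max(zs)+1]
print(result.shape)  # works: (1, 1, 1)
```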
Case 2 (error case): the array contains only zeros, so there are no indexes to take the min and max of, and the code raises an error.
```python
d = np.zeros(shape=[5, 5, 5])  # no values except zeros

>>> result.shape
Traceback (most recent call last):
  File "C:\Users\zzz\Desktop\py.py", line 7, in <module>
    result = d[min(xs):max(xs)+1,min(ys):max(ys)+1,min(zs):max(zs)+1]
ValueError: min() arg is an empty sequence
```
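One way to avoid the exception is to check for the empty case before taking the min and max. This is a minimal sketch; the function name `crop_nonzero` and the choice to return `None` for an all-zero array are assumptions, not part of the algorithm above.

```python
import numpy as np

def crop_nonzero(arr):
    """Crop arr to the bounding box of its non-zero entries.

    Returns None if arr contains only zeros (an assumed convention;
    raising a descriptive error would work just as well).
    """
    index_arrays = np.nonzero(arr)
    if index_arrays[0].size == 0:
        return None  # nothing to crop to
    box = tuple(
        slice(int(idx.min()), int(idx.max()) + 1) for idx in index_arrays
    )
    return arr[box]

print(crop_nonzero(np.zeros((5, 5, 5))))  # None
```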
EDIT: Because my solution did not get enough love and understanding, here is an example with a 4-dimensional array, where three dimensions belong to the image and the extra dimension (axis 0 in the code) is where the images are stored.
```python
import numpy as np

class ImageContainer(object):
    def __init__(self, first_image):
        self.container = np.uint8(np.expand_dims(np.array(first_image), axis=0))

    def add_image(self, image):
        # print(image.shape)
        temp = np.uint8(np.expand_dims(np.array(image), axis=0))
        # print(temp.shape)
        self.container = np.concatenate((self.container, temp), axis=0)
        print('container shape', self.container.shape)

# Create image container storage
image = np.zeros(shape=[5, 5, 3])  # some image
image[2, 2, 1] = 1  # put something random in it
container = ImageContainer(image)

image = np.zeros(shape=[5, 5, 3])  # some image
image[2, 2, 2] = 1
container.add_image(image)

image = np.zeros(shape=[5, 5, 3])  # some image
image[2, 3, 0] = 1  # if we set [2,2,0] = 1, we can expect all images will have just 1x1 pixel size
container.add_image(image)

image = np.zeros(shape=[5, 5, 3])  # some image
image[2, 2, 1] = 1
container.add_image(image)

print(container.container.shape)  # (4, 5, 5, 3): 4 images, size 5x5, 3 channels

# remove borders from all images at once
xs, ys, zs, zzs = np.where(container.container != 0)  # for 4D object

# extract cube with extreme limits of where the values are != 0
result = container.container[min(xs):max(xs)+1, min(ys):max(ys)+1, min(zs):max(zs)+1, min(zzs):max(zzs)+1]
print('Final shape:', result.shape)  # Final shape: (4, 1, 2, 3): 4 images, size 1x2, 3 channels
```
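Regarding the memory concern in the question's update: the shared bounding box does not require the whole data set to be stacked into one array. A possible sketch, assuming the samples can be produced one at a time (the generator and `make_sample` below are stand-ins for whatever loader is actually used, and `dataset_bounding_box` is a hypothetical name), accumulates one boolean "occupied" vector per axis and derives the slices at the end.

```python
import numpy as np

def dataset_bounding_box(arrays):
    """Return one slice per axis covering the non-zero values of all arrays.

    `arrays` is any iterable of equally shaped arrays; only one of them
    has to be in memory at a time.
    """
    occupied = None  # one boolean vector per axis
    for arr in arrays:
        mask = np.asarray(arr) != 0
        axes = range(mask.ndim)
        # project the mask onto each axis by reducing over all other axes
        proj = [mask.any(axis=tuple(a for a in axes if a != ax)) for ax in axes]
        if occupied is None:
            occupied = proj
        else:
            occupied = [o | p for o, p in zip(occupied, proj)]
    if occupied is None or not all(vec.any() for vec in occupied):
        raise ValueError("no non-zero value found in the data set")
    box = []
    for vec in occupied:
        idx = np.flatnonzero(vec)
        box.append(slice(int(idx[0]), int(idx[-1]) + 1))
    return tuple(box)

def make_sample(position):
    """Stand-in for loading one sample from disk."""
    sample = np.zeros((5, 5, 3), dtype=np.float32)
    sample[position] = -0.5  # negative values count as content too
    return sample

positions = [(2, 2, 1), (2, 2, 2), (2, 3, 0), (2, 2, 1)]
box = dataset_bounding_box(make_sample(p) for p in positions)
print(box)  # (slice(2, 3, None), slice(2, 4, None), slice(0, 3, None))

# the same box can be applied to each sample, including unseen test samples
cropped = [make_sample(p)[box] for p in positions]
```

Unlike the container example above, this crops only the per-sample axes and leaves the number of samples untouched. Whether the cropped data then fits in memory still depends on how much border the data set actually has.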