[Image: memory usage]
I get a MemoryError. I am using NumPy with Python 3. I have two NumPy arrays, X and Y, each of shape (36000, 256, 256, 3), and the error occurs when I run the following code, which prepares the training data. Is there another way to do this that uses less memory?
This is my processor: Intel® Xeon(R) CPU E5-2620 v4 @ 2.10GHz × 32
The error is raised at this line: X, Y = shuffle(X,Y)
X = []
Y = []
for im, normal in zip(images, normals):
    image = getImageArr(dir_resize_mRGB + im, 256, 256)
    X.append(image)
    Y.append(getNormalArr(dir_resize_mNormal + normal, 256, 256))
X, Y = np.array(X) , np.array(Y)
print(X.shape)
X_min = np.min(X)
X_max = np.max(X)
X = (X-X_min)/(X_max-X_min)
print('min:{}, max:{}'.format(X_min, X_max))
train_rate = 0.85
np.random.seed(42)
index_train = np.random.choice(X.shape[0],int(X.shape[0]*train_rate),replace=False)
index_test = list(set(range(X.shape[0])) - set(index_train))
X, Y = shuffle(X,Y)
X_train, y_train = X[index_train],Y[index_train]
X_test, y_test = X[index_test],Y[index_test]
Traceback (most recent call last):
  File "our_train_normal.py", line 312, in <module>
    X, Y = shuffle(X,Y)
  File "/home/ivlab/anaconda2/envs/tuto/lib/python3.7/site-packages/sklearn/utils/__init__.py", line 403, in shuffle
    return resample(*arrays, **options)
  File "/home/ivlab/anaconda2/envs/tuto/lib/python3.7/site-packages/sklearn/utils/__init__.py", line 327, in resample
    resampled_arrays = [safe_indexing(a, indices) for a in arrays]
  File "/home/ivlab/anaconda2/envs/tuto/lib/python3.7/site-packages/sklearn/utils/__init__.py", line 327, in <listcomp>
    resampled_arrays = [safe_indexing(a, indices) for a in arrays]
  File "/home/ivlab/anaconda2/envs/tuto/lib/python3.7/site-packages/sklearn/utils/__init__.py", line 216, in safe_indexing
    return X.take(indices, axis=0)
MemoryError
Answer
It is not clear from the code alone whether shuffle is a custom function or numpy.random.shuffle, which takes only one array; the traceback suggests it is sklearn.utils.shuffle.
If you are running into an out-of-memory error, you should first try sub-sampling your arrays, e.g. X = X[:100] and Y = Y[:100], and verify that the error really is caused by exceeding memory.
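Before sub-sampling, a quick size estimate can show whether memory is a plausible cause. The sketch below only does arithmetic on the shape given in the question and allocates nothing; the helper name estimate_bytes is made up for illustration, and the dtypes are assumptions, since the question does not say which dtype getImageArr returns.

```python
import numpy as np


def estimate_bytes(shape, dtype):
    """Return the number of bytes an array of this shape and dtype would occupy."""
    n_elements = int(np.prod(shape, dtype=np.int64))
    return n_elements * np.dtype(dtype).itemsize


shape = (36000, 256, 256, 3)  # shape of X (and of Y) from the question

# The dtypes below are assumptions; check X.dtype in the real script.
for dtype in (np.float64, np.float32, np.uint8):
    gib = estimate_bytes(shape, dtype) / 2**30
    print("{:>8}: {:.1f} GiB per array".format(np.dtype(dtype).name, gib))
```

If the arrays are float64, each one alone would need roughly 53 GiB, and any operation that returns a shuffled copy needs that much again on top of the original.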
To shuffle two arrays in the same order, I suggest using numpy.random.permutation, which gives you an array of shuffled indices.
shuff_indx = numpy.random.permutation(X.shape[0])
X = X[shuff_indx, :]
Y = Y[shuff_indx, :]
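Note that indexing with an index array returns a copy, so the lines above still allocate a second full-size X and Y. One possible way to avoid that extra pass, sketched here under assumptions, is to shuffle only the indices and slice them into train and test parts, since a random permutation already randomises the order. The helper name split_indices and the small stand-in arrays are made up for illustration; the 0.85 rate and seed 42 are taken from the question.

```python
import numpy as np


def split_indices(n_samples, train_rate=0.85, seed=42):
    """Return disjoint, randomly ordered train and test index arrays."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)      # shuffled 0 .. n_samples - 1
    n_train = int(n_samples * train_rate)
    return perm[:n_train], perm[n_train:]


# Small stand-in arrays; the real X and Y have shape (36000, 256, 256, 3).
X = np.arange(20 * 2 * 2 * 3, dtype=np.float32).reshape(20, 2, 2, 3)
Y = X + 0.5

train_idx, test_idx = split_indices(X.shape[0])

# The same index arrays are applied to X and Y, so pairs stay aligned.
X_train, y_train = X[train_idx], Y[train_idx]
X_test, y_test = X[test_idx], Y[test_idx]
```

The train and test subsets are still copies, so peak memory is the original arrays plus the subsets; this only removes the additional full-size copy made by a separate shuffle step.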