
CUDA Error: out of memory – Python process utilizes all GPU memory

Even after rebooting the machine, more than 95% of the GPU memory is used by a python3 process (the system-wide interpreter). Note that the memory stays allocated even when no training scripts are running, and I’ve never used keras/tensorflow in the system environment, only inside a venv or in a docker container.

UPDATED: The last activity was running an NN test script with the following configuration:

tensorflow==1.14.0
Keras==2.0.3

import tensorflow as tf
from keras import backend as K

tf.autograph.set_verbosity(1)
tf.set_random_seed(1)

# Limit CPU thread pools and let the GPU allocator grow on demand
session_conf = tf.ConfigProto(intra_op_parallelism_threads=8,
                              inter_op_parallelism_threads=8)
session_conf.gpu_options.allow_growth = True
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

$ nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.26       Driver Version: 440.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   53C    P3    N/A /  N/A |   3981MiB /  4042MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      4105      G   /usr/lib/xorg/Xorg                           145MiB |
|    0      4762      C   /usr/bin/python3                            3631MiB |
|    0      4764      G   /usr/bin/gnome-shell                          88MiB |
|    0      5344      G   ...quest-channel-token=8947774662807822104    61MiB |
|    0      6470      G   ...Charm-P/ch-0/191.6605.12/jre64/bin/java     5MiB |
|    0      7200      C   python                                        45MiB |
+-----------------------------------------------------------------------------+


After rebooting into recovery mode, I tried running nvidia-smi -r, but it didn’t solve the issue.


Answer

By default, TensorFlow allocates GPU memory for the lifetime of the process, not for the lifetime of the session object, so the memory can linger long after the session is gone. That is why memory is still held after you stop the program. Setting gpu_options.allow_growth = True is more flexible, but it only changes how memory is acquired: instead of grabbing nearly everything up front, TensorFlow allocates as much GPU memory as the process ends up needing, and it still keeps that memory until the process exits.
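As a rough illustration (a minimal sketch assuming TensorFlow 1.x and a visible GPU, not code from the question), you can see this behavior by closing a session and checking nvidia-smi from another terminal while the Python process is still alive; the allocation only disappears once the process exits:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # grow on demand instead of grabbing ~all memory

sess = tf.Session(config=config)
# Run something on the GPU so memory actually gets allocated
a = tf.random_normal([2048, 2048])
b = tf.random_normal([2048, 2048])
sess.run(tf.matmul(a, b))
sess.close()                             # the session object is gone ...

# ... but the GPU memory is still held by this process:
# check `nvidia-smi` in another terminal before the script exits
input("Session closed - inspect nvidia-smi, then press Enter to exit")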

To prevent tf.Session from using all of your GPU memory, you can cap the total amount the process is allowed to allocate. Replace gpu_options.allow_growth = True with a fixed memory fraction (let’s use 50%, since your program seems to need a lot of memory):

session_conf.gpu_options.per_process_gpu_memory_fraction = 0.5

This should keep the process from reaching the upper limit and cap it at roughly 2 GB (since it looks like you have a 4 GB GPU).
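Putting it together, a minimal sketch of the adjusted configuration (assuming the same TF 1.x / Keras setup as in the question) would look like:

import tensorflow as tf
from keras import backend as K

session_conf = tf.ConfigProto(intra_op_parallelism_threads=8,
                              inter_op_parallelism_threads=8)
# Cap this process at ~50% of total GPU memory instead of letting it grow unbounded
session_conf.gpu_options.per_process_gpu_memory_fraction = 0.5
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)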

User contributions licensed under: CC BY-SA