I am running a very simple piece of code that reads txt files and adds their contents to an existing dictionary. With htop I see that the used memory increases linearly until I run out of memory. Here is a simplified version of the code:
import numpy as np

data = np.load(path_dictionary, allow_pickle=True)
dic = data.item()

for ids in dic:
    output = np.loadtxt(filename)
    array = output[:,1]
    dic[ids][new_info] = array
I tried deleting the output and running the garbage collector inside the loop, but it has not helped:
del output
del array
gc.collect()
I used a function from this post to get the size of the dictionary before and after 100 iterations. The original dictionary is 9GB and its size increases by about 13MB, while according to htop the used memory increases by 10GB. The script is supposed to read around 70K files.
Can someone help me with what is causing the memory leak, and possible solutions for it?
Answer
When you call array = output[:,1], NumPy just creates a view. That means it keeps a reference to the whole (presumably large) output array, plus the information that array is the column at index 1. When you then store this reference in dic, a reference to the whole output array still exists, so the garbage collector cannot free its memory.
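You can verify this with a quick, self-contained check (the output array below is just a stand-in for one np.loadtxt result; its shape is made up for illustration):

import numpy as np

output = np.zeros((4, 3))      # stands in for the 2-D array returned by np.loadtxt
array = output[:, 1]           # basic slicing returns a view, not a copy

print(array.base is output)    # True: the view holds a reference to the parent array
print(array.flags.owndata)     # False: the view has no data buffer of its own

del output                     # the full (4, 3) buffer is NOT freed here,
                               # because array still references it via .base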
To work around this issue, just instruct NumPy to create a copy:
array = output[:,1].copy()
That way array will contain its own copy of the data (which is slower than creating the view), but the point is that once you delete output (either explicitly via del output or by overwriting it in the next iteration), there are no more references to it and the memory will be freed.
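Applied to the loop from the question, it would look roughly like this (a minimal sketch; path_dictionary, filename and new_info are the placeholders from the question and are assumed to be defined elsewhere):

import numpy as np

data = np.load(path_dictionary, allow_pickle=True)
dic = data.item()

for ids in dic:
    output = np.loadtxt(filename)               # full 2-D array for this file
    dic[ids][new_info] = output[:, 1].copy()    # store an independent copy, not a view
    # output is overwritten on the next iteration, so its buffer can be freed;
    # no explicit del or gc.collect() is needed

With the copy, only one file's worth of data stays referenced through output at any time, so memory use stays roughly constant instead of growing with every file.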