Python memory release when using a for loop inside a class

I ran into trouble after creating a class to process raster images. The class includes several methods for checking databases and processing the images.

The usage script is super simple:

from hidroclabc import HidroCLVariable, mod13q1extractor

ndvi = HidroCLVariable('ndvi', some_db)
evi = HidroCLVariable('evi', some_db)
nbr = HidroCLVariable('nbr', some_db)

modext = mod13q1extractor(ndvi,evi,nbr)

modext.run_extraction()

The relevant part of the run_extraction() method is the following:

for scene in scenes_to_process:
    if scene not in self.ndvi.indatabase:
        print(f'Processing scene {scene} for ndvi')
        r = re.compile('.*' + scene + '.*')
        selected_files = list(filter(r.match, scenes_path))
        start = time.time()
        file_date = datetime.strptime(scene, 'A%Y%j').strftime('%Y-%m-%d')
        mos = mosaic_raster(selected_files, '250m 16 days NDVI')
        mos = mos * 0.1
        temporal_raster = os.path.join(tempfolder, 'ndvi_' + scene + '.tif')
        result_file = os.path.join(tempfolder, 'ndvi_' + scene + '.csv')
        mos.rio.to_raster(temporal_raster, compress='LZW')
        run_WeightedMeanExtraction(temporal_raster, result_file)
        write_line(self.ndvi.database, result_file, self.ndvi.catchment_names, scene, file_date, nrow=1)
        end = time.time()
        time_dif = str(round(end - start))
        currenttime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
        print(f'Time elapsed for {scene}: {time_dif} seconds')
        write_log(hcl.log_veg_o_modis_ndvi_mean, scene, currenttime, time_dif, self.ndvi.database)
        os.remove(temporal_raster)
        os.remove(result_file)

The method runs several steps to get an observation for a given variable. The code works, but it doesn’t release memory. Since everything happens inside a loop, memory usage grows with every iteration:

[Image: memory usage growing with each iteration]

When I close the terminal window executing this process, the memory used drops significantly:

[Image: memory usage dropping after the terminal is closed]

This happens on an Ubuntu Server 22 LTS machine. When I run the same code on my laptop (macOS), which has half the server’s RAM, the memory usage looks quite different:

[Image: memory usage on the macOS laptop]

This didn’t happen when the same steps were written as plain functions. Memory only ran out once I placed the for loop inside a class.

How can I improve the memory management of my class?

Answer

Super simple fix: run the garbage collector explicitly at the end of each loop iteration in the class’s method:

import gc

#code

for scene in scenes_to_process:
    if scene not in self.ndvi.indatabase:
        # code
        gc.collect()

Now the memory usage looks beautiful:

[Image: memory usage after adding gc.collect()]
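
One possible reason an explicit gc.collect() helps: objects caught in a reference cycle are not freed by reference counting when they go out of scope, only when the cyclic collector runs. Whether the rasters in the question are actually held in cycles is an assumption, not something the original code confirms. The Raster class, the process() helper and the scene names below are hypothetical stand-ins used only to show the effect in isolation:

```python
import gc
import weakref


class Raster:
    """Hypothetical stand-in for a large object created on every iteration."""

    def __init__(self, scene):
        self.scene = scene
        self.payload = bytearray(1_000_000)  # pretend this is pixel data
        self.owner = self  # reference cycle: the object refers to itself


def process(scene, tracker):
    """Create a Raster, record a weak reference to it, then drop it."""
    raster = Raster(scene)
    tracker.append(weakref.ref(raster))
    # 'raster' goes out of scope here, but the cycle keeps the object alive


def alive(tracker):
    """Count the tracked objects that have not been freed yet."""
    return sum(1 for ref in tracker if ref() is not None)


gc.disable()  # switch off automatic collection so the demo is deterministic
try:
    tracker = []
    for scene in ["A2020001", "A2020017", "A2020033"]:  # made-up scene names
        process(scene, tracker)
    alive_before = alive(tracker)  # every object is still in memory

    gc.collect()  # the explicit collection finds and frees the cycles
    alive_after = alive(tracker)
finally:
    gc.enable()

print(alive_before, alive_after)
```

In a normal run the automatic collector would eventually free these objects too, but only once its allocation thresholds are reached. With a few very large objects per iteration that can be late enough for memory to pile up, which is why calling gc.collect() once per iteration can flatten the curve.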

User contributions licensed under: CC BY-SA