I ran into trouble after creating a class to process raster images. The class includes several methods for checking databases and processing the images.
The usage script is super simple:
    from hidroclabc import HidroCLVariable, mod13q1extractor

    ndvi = HidroCLVariable('ndvi', some_db)
    evi = HidroCLVariable('evi', some_db)
    nbr = HidroCLVariable('nbr', some_db)

    modext = mod13q1extractor(ndvi, evi, nbr)
    modext.run_extraction()
The method run_extraction() is the following:
    for scene in scenes_to_process:
        if scene not in self.ndvi.indatabase:
            print(f'Processing scene {scene} for ndvi')
            r = re.compile('.*'+scene+'.*')
            selected_files = list(filter(r.match, scenes_path))
            start = time.time()
            file_date = datetime.strptime(scene, 'A%Y%j').strftime('%Y-%m-%d')
            mos = mosaic_raster(selected_files,'250m 16 days NDVI')
            mos = mos * 0.1
            temporal_raster = os.path.join(tempfolder,'ndvi_'+scene+'.tif')
            result_file = os.path.join(tempfolder,'ndvi_'+scene+'.csv')
            mos.rio.to_raster(temporal_raster, compress='LZW')
            run_WeightedMeanExtraction(temporal_raster,result_file)
            write_line(self.ndvi.database, result_file, self.ndvi.catchment_names, scene, file_date, nrow = 1)
            end = time.time()
            time_dif = str(round(end - start))
            currenttime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
            print(f'Time elapsed for {scene}: {str(round(end - start))} seconds')
            write_log(hcl.log_veg_o_modis_ndvi_mean,scene,currenttime,time_dif,self.ndvi.database)
            os.remove(temporal_raster)
            os.remove(result_file)
The method runs several steps to get an observation for a given variable. The code works, but it doesn't release memory. Since it's a loop, memory usage increases with every iteration.
When I close the terminal window executing this process, memory usage drops significantly.
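As a rough way to put numbers on the per-iteration growth described above, here is a minimal sketch using the standard library's tracemalloc. The names process_scene and measure_growth and the scene IDs are hypothetical stand-ins, not part of the original class, and the stand-in deliberately retains memory so the growth is visible. Note that tracemalloc only sees allocations made through Python's allocators, so memory held by native libraries may not show up.

```python
import tracemalloc


def process_scene(scene, sink):
    """Hypothetical stand-in for the real per-scene work.

    It deliberately keeps about 1 MB alive per call to mimic memory
    that is not released between iterations.
    """
    sink.append(bytearray(1_000_000))


def measure_growth(scenes):
    """Return the traced memory (in bytes) after each iteration."""
    tracemalloc.start()
    retained = []   # keeps the buffers alive on purpose
    readings = []
    for scene in scenes:
        process_scene(scene, retained)
        current, _peak = tracemalloc.get_traced_memory()
        readings.append(current)
    tracemalloc.stop()
    return readings


# Illustrative scene IDs in the 'A%Y%j' format used in the question
readings = measure_growth(['A2000049', 'A2000065', 'A2000081'])
for i, value in enumerate(readings):
    print(f'iteration {i}: {value / 1e6:.1f} MB traced')
```

If the readings keep climbing from one iteration to the next, something is still holding on to the objects created in earlier iterations.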
This is happening on an Ubuntu Server LTS 22 machine. When I run the same code on my laptop (macOS), which has half of the server's RAM, the memory usage is quite different.
This didn't happen with a functional programming approach; the memory problems started when I placed the for loop inside a class.
How can I improve the memory management of my class?
Answer
Super simple fix. Run the garbage collector explicitly at the end of each loop iteration in the class's method:
    import gc

    # code
    for scene in scenes_to_process:
        if scene not in self.ndvi.indatabase:
            # code
            gc.collect()
Now the memory usage looks beautiful.
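One situation where an explicit gc.collect() helps is when each iteration leaves behind objects caught in reference cycles, which reference counting alone cannot free. Whether that is what happened in the original class is an assumption; the sketch below only demonstrates the mechanism. Node and Extractor are hypothetical classes made up for this illustration.

```python
import gc
import weakref


class Node:
    """Hypothetical object that takes part in a reference cycle."""

    def __init__(self):
        self.payload = bytearray(1_000_000)
        self.me = self  # self-reference: refcounting alone cannot free this


class Extractor:
    """Hypothetical class with a processing loop inside a method."""

    def run_extraction(self, scenes, collect=True):
        refs = []
        for scene in scenes:
            node = Node()                   # per-iteration work creates a cycle
            refs.append(weakref.ref(node))  # lets us observe when it is freed
            del node                        # the cycle keeps the object alive
            if collect:
                gc.collect()                # frees the cyclic garbage right away
        return refs


refs = Extractor().run_extraction(['A2000049', 'A2000065', 'A2000081'])
print(all(r() is None for r in refs))
```

Without the explicit call, the cyclic objects stay in memory until the collector's own thresholds trigger a pass, which can be a long time when each object is large but few objects are created.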