
Efficient way to extract data from NETCDF files

I have a number of coordinates (roughly 20,000) for which I need to extract data from a number of NetCDF files, each with roughly 30,000 timesteps (future climate scenarios). Using the solution here is not efficient, because of the time spent at each i,j converting "dsloc" to a DataFrame (see the code below). **An example NetCDF file can be downloaded from here.**

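The original code block was not preserved on this page. A minimal sketch of the approach described above, looping over the points and converting each nearest-neighbour selection to a DataFrame, might look like the following; the file name, column names, and the `stations.csv` input are assumptions:

```python
import pandas as pd
import xarray as xr

# Assumed inputs: ~20,000 station coordinates and one NetCDF file with ~30,000 timesteps
points = pd.read_csv('stations.csv')       # assumed columns: lat, lon
ds = xr.open_dataset('example.nc')         # assumed file name

frames = []
for lat, lon in zip(points.lat, points.lon):
    # nearest-neighbour selection of a single grid cell
    dsloc = ds.sel(lat=lat, lon=lon, method='nearest')
    # the slow step: one DataFrame conversion per point
    frames.append(dsloc.to_dataframe())

result = pd.concat(frames)
```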

Timing this loop shows that each i,j takes around 9 seconds to process. Given the large number of coordinates and NetCDF files with many timesteps, I wonder whether there is a Pythonic way to optimize this code. I could also use CDO and NCO operators, but I ran into a similar performance issue with them.


Answer

This is a perfect use case for xarray’s advanced indexing using a DataArray index.

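The answer's code block was also lost in the page capture. A sketch of xarray's vectorized indexing with DataArray indexers, reusing the assumed inputs from the question sketch and a new `stid` (station id) dimension, would be:

```python
import pandas as pd
import xarray as xr

points = pd.read_csv('stations.csv')       # assumed station list with lat/lon columns
ds = xr.open_dataset('example.nc')

# Wrap the target coordinates in DataArrays that share a new 'stid' dimension
lats = xr.DataArray(points.lat.values, dims='stid')
lons = xr.DataArray(points.lon.values, dims='stid')

# One vectorized nearest-neighbour selection for all points at once
obs = ds.sel(lat=lats, lon=lons, method='nearest')
```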

Other dimensions in the data will be ignored, so if your original data has dimensions (lat, lon, z, time) your new data would have dimensions (stid, z, time).
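Because all points are selected in a single call, the result can also be converted to a DataFrame once instead of once per point, which is where the original loop spent its time (again using the assumed names from the sketch above):

```python
# one conversion for every station, indexed by the remaining dimensions, e.g. (stid, z, time)
df = obs.to_dataframe()
```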
