
Efficient way to extract data from NETCDF files

I have a number of coordinates (roughly 20,000) for which I need to extract data from a number of NetCDF files, each with roughly 30,000 timesteps (future climate scenarios). Using the solution here is not efficient, because of the time spent at each i,j converting "dsloc" to a DataFrame (see the code below). **An example NetCDF file can be downloaded from here.**

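The original code block was not preserved on this page. A minimal sketch of the approach described above, looping over the points and converting each nearest-neighbour selection to a DataFrame, might look like the following; the file name, column names, and the `stations.csv` input are assumptions:

```python
import pandas as pd
import xarray as xr

# Assumed inputs: ~20,000 station coordinates and one NetCDF file with ~30,000 timesteps
points = pd.read_csv('stations.csv')       # assumed columns: lat, lon
ds = xr.open_dataset('example.nc')         # assumed file name

frames = []
for lat, lon in zip(points.lat, points.lon):
    # nearest-neighbour selection of a single grid cell
    dsloc = ds.sel(lat=lat, lon=lon, method='nearest')
    # the slow step: one DataFrame conversion per point
    frames.append(dsloc.to_dataframe())

result = pd.concat(frames)
```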

Timing this loop shows that each i,j takes around 9 seconds to process. Given the large number of coordinates and NetCDF files with many timesteps, I wonder whether there is a Pythonic way to optimize this code. I could also use CDO and NCO operators, but I ran into a similar performance issue with them.


Answer

This is a perfect use case for xarray’s advanced indexing using a DataArray index.

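The answer's code block was also lost in the page capture. A sketch of xarray's vectorized indexing with DataArray indexers, reusing the assumed inputs from the question sketch and a new `stid` (station id) dimension, would be:

```python
import pandas as pd
import xarray as xr

points = pd.read_csv('stations.csv')       # assumed station list with lat/lon columns
ds = xr.open_dataset('example.nc')

# Wrap the target coordinates in DataArrays that share a new 'stid' dimension
lats = xr.DataArray(points.lat.values, dims='stid')
lons = xr.DataArray(points.lon.values, dims='stid')

# One vectorized nearest-neighbour selection for all points at once
obs = ds.sel(lat=lats, lon=lons, method='nearest')
```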

Other dimensions in the data will be ignored, so if your original data has dimensions (lat, lon, z, time) your new data would have dimensions (stid, z, time).
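Because all points are selected in a single call, the result can also be converted to a DataFrame once instead of once per point, which is where the original loop spent its time (again using the assumed names from the sketch above):

```python
# one conversion for every station, indexed by the remaining dimensions, e.g. (stid, z, time)
df = obs.to_dataframe()
```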
