This is the data I want to convert. It is saved as CSV and was originally extracted from a NetCDF file, so some of the longitude and latitude values are repeated.
lon
Out[56]:
0       121.25
1       121.75
2       122.25
3       122.75
4       123.25
         ...
3819    109.75
3820    110.25
3821    108.75
3822    109.25
3823    109.75
Name: E, Length: 3824, dtype: float64

lat
Out[57]:
0       53.25
1       53.25
2       53.25
3       53.25
4       53.25
        ...
3819    19.25
3820    19.25
3821    18.75
3822    18.75
3823    18.75
Name: N, Length: 3824, dtype: float64

pr
Out[58]:
0       136.094444
1        95.242593
2       120.557407
3        92.844444
4       106.596296
           ...
3819    176.818519
3820    512.942593
3821    271.687037
3822    359.205556
3823    242.946296
Name: annual, Length: 3824, dtype: float64
I want to convert them to xarray because I need 'pr' to be 2D (with no repeated lon or lat values), like the following one.
<xarray.DataArray 'Temperature_surface' (lat: 153, lon: 257)>
array([[258.67383, 258.57382, 258.87384, ..., 249.67383, 246.57382, 244.97383],
       [258.57382, 258.77383, 258.67383, ..., 245.27383, 246.77383, 251.47383],
       [258.57382, 258.47382, 258.27383, ..., 246.67383, 246.07382, 251.47383],
       ...,
       [300.77383, 300.77383, 300.67383, ..., 302.37384, 302.27383, 302.27383],
       [300.87384, 300.77383, 300.67383, ..., 302.37384, 302.37384, 302.27383],
       [300.87384, 300.97382, 300.97382, ..., 302.37384, 302.37384, 302.27383]],
      dtype=float32)
Coordinates:
  * lat      (lat) float32 56.0 55.75 55.5 55.25 55.0 ... 18.75 18.5 18.25 18.0
  * lon      (lon) float32 72.0 72.25 72.5 72.75 ... 135.2 135.5 135.8 136.0
Here is my code:
import pandas as pd

data = pd.read_csv('E:DesktopData ProcessingCorrect NewCSVChina_R95P.csv')
lon = data['E']
lat = data['N']
pr = data['annual']
df = pd.DataFrame({
    'lon': lon,
    'lat': lat,
    'pr': pr
})
df = df.set_index(['lon', 'lat'])
df is like this
Out[97]:
                      pr
lon    lat
121.25 53.25  136.094444
121.75 53.25   95.242593
122.25 53.25  120.557407
122.75 53.25   92.844444
123.25 53.25  106.596296
...
109.75 19.25  176.818519
110.25 19.25  512.942593
108.75 18.75  271.687037
109.25 18.75  359.205556
109.75 18.75  242.946296

[3824 rows x 1 columns]
And then when I use
df.to_xarray()
I get this error:
ValueError: cannot convert a DataFrame with a non-unique MultiIndex into xarray
What should I do? Thanks for answering!
Answer
As the error says, your index is not unique: some (lon, lat) pairs appear more than once. This is a problem for xarray because the repeated rows may carry contradictory values for the same grid cell. Each (lon, lat) pair must map to exactly one value, so you need to either drop the duplicates or average the values for each pair. The following will work if the repeated rows are plain duplicates:
# Keep only the first row for each repeated (lon, lat) index pair
df = df[~df.index.duplicated(keep='first')]
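If the repeated rows hold different values, averaging may be the safer option. Below is a minimal sketch of that route. It assumes 'lon', 'lat' and 'pr' are still ordinary columns (call df.reset_index() first if they are already the index), and it uses a small made-up sample rather than the real CSV. The names dupes, unique and da are only illustrative.

```python
import pandas as pd

# Hypothetical sample with the question's column names;
# the pair (lon=121.25, lat=53.25) appears twice on purpose.
df = pd.DataFrame({
    "lon": [121.25, 121.75, 121.25, 109.75],
    "lat": [53.25, 53.25, 53.25, 18.75],
    "pr": [136.0, 95.0, 140.0, 243.0],
})

# Inspect every row whose (lon, lat) pair occurs more than once.
dupes = df[df.duplicated(subset=["lon", "lat"], keep=False)]
print(dupes)

# Average pr over each (lat, lon) pair; the resulting MultiIndex is unique.
unique = df.groupby(["lat", "lon"])["pr"].mean()

# Convert to a 2D DataArray with dims (lat, lon).
# Grid cells with no data are filled with NaN. Requires xarray to be installed.
da = unique.to_xarray()
print(da)
```

Grouping by lat first makes lat the leading dimension, which matches the (lat, lon) layout of the example DataArray in the question. It is worth looking at dupes before choosing between dropping and averaging, since the two give different results whenever the repeated rows disagree.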