Get values from a DataFrame with a MultiIndex containing NaNs

I cannot access the values at an index position that contains a NaN, and I wonder how to solve this. (In my project this index has a very special meaning and I really need to keep it; otherwise I would have to make some dirty manual modifications. “There is always a solution”, even if it is a very bad one.)


Now I want to access the [3, 7] values with df.loc[(0, np.nan)], but I get KeyError: (0, nan).

Just to put it in perspective: [df.loc[idx] for idx in df.index if not pd.isna(idx[1])] works properly, because it skips the problematic index entry.

What am I missing and how could I solve this?

(Windows 10, Python 3.8.5, pandas 1.3.1, NumPy 1.20.3; reported to pandas here)

Answer

Update

I am able to reproduce your error after grouping and aggregating a data frame.
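
The snippet that originally followed is not preserved. As a rough sketch of how such an index can arise, assuming the grouping was done with `dropna=False` (an assumption, as are the column names `a`, `b`, `x`, `y` and the data), the code below builds a grouped frame whose MultiIndex holds a NaN, tries the tuple lookup, and then selects the same row with a boolean mask on the index levels. Whether the `.loc` call raises depends on the pandas version, so the sketch only records what happened.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column names and values are illustrative only.
raw = pd.DataFrame(
    {
        "a": [0, 0, 0, 1],
        "b": [1.0, np.nan, np.nan, 2.0],
        "x": [1, 1, 2, 4],
        "y": [5, 3, 4, 6],
    }
)

# dropna=False keeps the NaN group, so NaN ends up in the MultiIndex.
grouped = raw.groupby(["a", "b"], dropna=False).sum()

# The tuple lookup may raise KeyError on some pandas versions.
try:
    row = grouped.loc[(0, np.nan)]
    loc_failed = False
except KeyError:
    row = None
    loc_failed = True

# Alternative that does not go through label lookup:
# select by a boolean mask built from the index levels.
level_a = grouped.index.get_level_values("a")
level_b = grouped.index.get_level_values("b")
masked = grouped[(level_a == 0) & level_b.isna()]

print(loc_failed)
print(masked)
```

The mask route is not part of the original answer; it is offered only as a fallback that avoids matching NaN as a label.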


Passing in an explicit MultiIndex works, though.


So does wrapping a single tuple in a list. Note that using [[]] returns a DataFrame.
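
To illustrate the `[[]]` remark without relying on NaN handling, here is a small sketch on a NaN-free MultiIndex (the frame and its values are made up for illustration): a bare tuple selects one row as a Series, while a list holding that tuple yields a one-row DataFrame.

```python
import pandas as pd

# Hypothetical frame with a two-level index and no missing labels.
index = pd.MultiIndex.from_tuples([(0, 1), (0, 2), (1, 1)], names=["a", "b"])
df = pd.DataFrame({"x": [1, 3, 4], "y": [5, 7, 6]}, index=index)

# A bare tuple selects a single row and returns it as a Series.
as_series = df.loc[(0, 2)]

# A list containing the tuple returns a one-row DataFrame.
as_frame = df.loc[[(0, 2)]]

print(type(as_series).__name__, type(as_frame).__name__)
```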


As does DataFrame.reindex (see also the user guide on reindexing).


Original Attempt to Reproduce Error

I am not able to reproduce your error. You can see below that using df.loc[(0, np.nan)] works.


Then I noticed that your index was printed as (0, nan) but mine was (0, np.nan). The difference was that I used np.nan and I suspect yours is pd.NA.


However, that did not resolve the difference. I was still able to use df.loc[(0, np.nan)].


Moreover, I was also able to use df.loc[(0, None)].


Just to confirm, np.nan, pd.NA, and None are all different objects. Pandas must treat them the same when used with DataFrame.loc.
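
A minimal check of that statement: the three values are distinct objects, yet `pd.isna` reports each of them as missing. This only shows the objects and `pd.isna`; it says nothing about how `DataFrame.loc` resolves them internally.

```python
import numpy as np
import pandas as pd

missing = [np.nan, pd.NA, None]

# Identity checks: none of the three is the same object as another.
all_distinct = (
    (np.nan is not pd.NA)
    and (pd.NA is not None)
    and (np.nan is not None)
)

# pd.isna treats every one of them as a missing value.
all_missing = all(pd.isna(value) for value in missing)

print(all_distinct, all_missing)  # True True
```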

User contributions licensed under: CC BY-SA