
Pandas .transform() results in NaN values after update to newer version

I have some code that worked about 3-4 years ago. Since then I’ve upgraded pandas, NumPy, and Python, and it no longer works. I’ve isolated what I believe is the issue, but I don’t understand why it occurs.

JavaScript

Problem: the result of the last line, “dc”, is a pandas.Series containing only NaN values. It should contain no NaN values.

Relevant information: the gb object is correct and has no NaN or null values. When I print “L” inside the function, or the value the function returns, I get the correct values. However, they are lost somewhere in the “dc” line. When I swap ‘transform’ for ‘apply’, ‘dc’ contains the correct values, but the object has duplicate column labels that make it unusable.

Thanks!

EDIT:

Below is some minimal code I spun up to produce the error.

JavaScript


Answer

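The NaNs appear because your function returns a DataFrame/Series whose index differs from the index of the group it was given. transform aligns the returned object back to the original index, and any label that has no match is filled with NaN.

To avoid the alignment, return a NumPy array from your function, so the values are assigned by position: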

JavaScript

output:

JavaScript
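
As a minimal sketch of the index mismatch described in this answer (this is not the asker's code; the frame, the column names `key` and `val`, and the functions `demean_bad` and `demean_ok` are all made up for illustration), the following compares a function that returns a Series with a fresh index against one that returns a NumPy array. It assumes a reasonably recent pandas release.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the non-default index makes the mismatch visible in every group.
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b"], "val": [1.0, 2.0, 3.0, 4.0]},
    index=[10, 11, 12, 13],
)


def demean_bad(s):
    # Wrapping the values in a new Series discards the group's labels
    # and replaces them with 0..n-1, which do not exist in the group.
    return pd.Series((s - s.mean()).to_numpy())


def demean_ok(s):
    # A plain NumPy array carries no index, so the values are placed
    # by position onto the group's own labels.
    return (s - s.mean()).to_numpy()


bad = df.groupby("key")["val"].transform(demean_bad)
ok = df.groupby("key")["val"].transform(demean_ok)

print(bad)  # expected to be all NaN: labels 0, 1 never match 10..13
print(ok)   # expected to hold the demeaned values on the original index
```

Returning the computed Series unchanged (without rebuilding it) would also keep the group's index intact; converting to an array is simply the most direct way to rule out alignment.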
User contributions licensed under: CC BY-SA