Skip to content
Advertisement

Pandas: replacing nan values conditionally within a group

I have a dataframe with missing values. for each index in a column group, i want to replace these values seperately. If all of the values in a group are missing, i want to replace the values with 1. If only some of the values are missing, i want to replace it with data from an imputed dataframe

dataframe 1

index d0_1 d0_2 d1_1 d1_2
group d0 d0 d1 d1
1 3 3 NaN NaN
2 3 NaN 3 3

dataframe 2 (the imputed one)

index d0_1 d0_2 d1_1 d1_2
group d0 d0 d1 d1
1 3 3 2 2
2 3 2 3 3

output:

index d0_1 d0_2 d1_1 d1_2
group d0 d0 d1 d1
1 3 3 1 1
2 3 2 3 3

my data is much larger and the groups are larger as well.

Ive been struggling now with this for days, i just cant seem to find a working solution

my current solution is iterating over all the groups, and using groupby.transform to replace values, but i dont know how to tell the lambda function to take the values from my second data frame, and my current lambda function also doesnt replace all the values with 1 either, instead just returning the old groups with no changes

JavaScript

Advertisement

Answer

IIUC:

JavaScript

This is assuming the columns are indeed a MultiIndex as you describe, e.g.:

JavaScript

(I initially spent a few minutes making such columns as I thought you had simple Index columns and a first row with the 'group'…)

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement