Skip to content
Advertisement

Groupby with multiindex replacement

For a given date, how can I replace missing UK values with the US value? Need in general code.

df
                              value
country        date         
uk            2016-01-01       NaN
              2016-01-08       5.0
us            2016-01-01       21.0
              2016-01-08       1.5
don't touch   2016-01-01       44.0
              2016-01-08       44.5

Expected Output

                              value
country          date        
uk             2016-01-01       21.0     
               2016-01-08       5.0
us             2016-01-01       21.0
               2016-01-08       1.5
don't touch   2016-01-01        44.0
              2016-01-08        44.5

Advertisement

Answer

You can select by lists for avoid remove MultiIndex and for correct align is used rename:

df.loc[['uk']] = df.loc[['uk']].fillna(df.loc[['us']].rename({'us':'uk'}))
print (df)
                        value
country     date             
uk          2016-01-01   21.0
            2016-01-08    5.0
us          2016-01-01   21.0
            2016-01-08    1.5
don't touch 2016-01-01   44.0
            2016-01-08   44.5

Or reshape by DataFrame.unstack, replace by rows and reshape back:

df = df.unstack()
df.loc['uk'] = df.loc['uk'].fillna(df.loc['us'])
df = df.stack()

print (df)
                        value
country     date             
don't touch 2016-01-01   44.0
            2016-01-08   44.5
uk          2016-01-01   21.0
            2016-01-08    5.0
us          2016-01-01   21.0
            2016-01-08    1.5

EDIT: If want use solution only of all NaNs solution is test it for scalar True or False:

print (df)
                       value
country     date            
uk          2016-01-01   NaN
            2016-01-08   NAN
us          2016-01-01  21.0
            2016-01-08   1.5
don't touch 2016-01-01  44.0
            2016-01-08  44.5

df = df.unstack()
print (df)
                 value           
date        2016-01-01 2016-01-08
country                          
don't touch       44.0       44.5
uk                 NaN        NAN
us                21.0        1.5

test = df.loc['uk'].isna().all()
print (test)
True

if test:
    df.loc['uk'] = df.loc['uk'].fillna(df.loc['us'])
df = df.stack(dropna=False)
print (df)
                        value
country     date             
don't touch 2016-01-01   44.0
            2016-01-08   44.5
uk          2016-01-01   21.0
            2016-01-08    1.5
us          2016-01-01   21.0
            2016-01-08    1.5
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement