Python pandas DataFrame: fill NaNs with a conditional mean of previous and next value

Question

I have a dataframe in which I want each NaN to be filled with the conditional mean of the previous and next values in the same column. For example, the value 6 is the mean of 5 and 7. This is only a small part of my dataframe, so I need to replace all the NaNs.

Accepted Answer

EDIT: To replace missing values in all columns, use:

    df = df.bfill().add(df.ffill()).div(2)

If you need to replace values only in some columns, e.g. the numeric ones:

    import numpy as np

    cols = df.select_dtypes(np.number).columns
    df[cols] = df[cols].bfill().add(df[cols].ffill()).div(2)

To fill only runs of exactly two consecutive NaNs, use:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'col': [1, 15.6, np.nan, np.nan, 15.8, 5,
                               np.nan, 4, 10, np.nan, np.nan, np.nan, 7]})

    # filter non-missing values
    m = df['col'].notna()
    # flag NaNs that belong to a run of exactly 2 consecutive NaNs
    m = df.groupby(m.cumsum()[~m])['col'].transform('size').eq(2)
    # expand the mask to the previous and next values around each 2-NaN run
    mask = m.shift(fill_value=False) | m.shift(-1, fill_value=False)
    print(mask)
    0     False
    1      True
    2      True
    3      True
    4      True
    5     False
    6     False
    7     False
    8     False
    9     False
    10    False
    11    False
    12    False
    Name: col, dtype: bool

    # for the filtered rows, compute the means
    df.loc[mask, 'col'] = df.loc[mask, 'col'].bfill().add(df.loc[mask, 'col'].ffill()).div(2)
    print(df)
         col
    0    1.0
    1   15.6
    2   15.7
    3   15.7
    4   15.8
    5    5.0
    6    NaN
    7    4.0
    8   10.0
    9    NaN
    10   NaN
    11   NaN
    12   7.0

If you need means for all missing values, remove the mask:

    df['col'] = df['col'].bfill().add(df['col'].ffill()).div(2)
    print(df)
         col
    0    1.0
    1   15.6
    2   15.7
    3   15.7
    4   15.8
    5    5.0
    6    4.5
    7    4.0
    8   10.0
    9    8.5
    10   8.5
    11   8.5
    12   7.0
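The two variants above (fill every gap, or fill only gaps of a given length) can be folded into one small helper. This is only a sketch: the function name `fill_gaps_with_neighbor_mean` and its `run_length` parameter are made up for illustration and are not part of pandas. It also assumes that "conditional" means "only NaN runs of exactly `run_length` rows", which is how the answer reads the question.

```python
import numpy as np
import pandas as pd


def fill_gaps_with_neighbor_mean(s, run_length=None):
    """Fill NaN runs with the mean of the values bordering each run.

    Hypothetical helper, not a pandas API. If run_length is given,
    only runs of exactly that many consecutive NaNs are filled.
    """
    # Mean of the nearest valid value before and after each position.
    # Valid positions map to themselves; runs touching either end of
    # the Series stay NaN because one neighbour is missing.
    neighbor_mean = s.ffill().add(s.bfill()).div(2)

    missing = s.isna()
    if run_length is not None:
        # The label increases at every valid value, so each NaN run
        # shares a label with the valid value just before it.
        run_id = s.notna().cumsum()
        # Number of NaNs in the run each row belongs to.
        run_size = missing.astype(int).groupby(run_id).transform('sum')
        missing = missing & run_size.eq(run_length)

    # Replace only the selected NaNs, leave everything else untouched.
    return s.mask(missing, neighbor_mean)


s = pd.Series([1, 15.6, np.nan, np.nan, 15.8, 5,
               np.nan, 4, 10, np.nan, np.nan, np.nan, 7], name='col')

pairs_only = fill_gaps_with_neighbor_mean(s, run_length=2)  # only the 2-NaN run
all_gaps = fill_gaps_with_neighbor_mean(s)                  # every interior gap
```

Unlike the mask built with `shift`, this version selects the NaN rows directly and computes the neighbour mean on the whole Series, so it should give the same result on the sample data while avoiding the extra step of expanding the mask to the bordering rows.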

Answer