Copy the last seen non-empty value of a column based on a condition in the most efficient way in Pandas/Python

Question

I need to copy and paste the previous non-empty value of a column based on a condition. It has to be done in the most efficient way because there are a couple of million rows, so a for loop would be computationally costly. Any help in this regard would be highly appreciated. Based on

Accepted Answer

You can forward-fill the NaN values using ffill with the most recent non-NaN value.

If you want to keep the NaNs in Col_B then simply create a new column (Col_C) as follows:

    df['Col_C'] = df['Col_B'].ffill()

Then replace the value in Col_B where Col_A has a value:

    df.loc[df['Col_A'].notnull(), 'Col_B'] = df.loc[df['Col_A'].notnull(), 'Col_C']
    df = df.drop(columns=['Col_C'])

Result:

           Col_A    Col_B
    0   10.2.6.1      NaN
    1        NaN     51.0
    2        NaN      NaN
    3   10.2.6.1     51.0
    4        NaN     64.0
    5        NaN      NaN
    6        NaN      NaN
    7   10.2.6.1     64.0

The above can be simplified if you do not need to keep all NaN rows. For example, it's possible to do:

    df['Col_B'] = df['Col_B'].ffill()
    df = df.dropna()

Result:

           Col_A    Col_B
    3   10.2.6.1     51.0
    7   10.2.6.1     64.0
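For anyone who wants to try this end to end, here is a minimal self-contained sketch of both variants. The input frame is an assumption, reconstructed from the result table in the answer since the question's original data is not shown. It also swaps the temporary Col_C for Series.where, which should give the same result as the answer's two-step assignment without the helper column. The names filled, keep_all and compact are just illustrative.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the result table in the answer
# (the question's original data is not shown).
df = pd.DataFrame({
    "Col_A": ["10.2.6.1", np.nan, np.nan, "10.2.6.1", np.nan, np.nan, np.nan, "10.2.6.1"],
    "Col_B": [np.nan, 51.0, np.nan, np.nan, 64.0, np.nan, np.nan, np.nan],
})

# Forward-filled copy of Col_B: each NaN takes the last non-NaN value above it.
filled = df["Col_B"].ffill()

# Variant 1: keep Col_B unchanged where Col_A is NaN;
# where Col_A has a value, use the forward-filled value instead.
keep_all = df.assign(Col_B=df["Col_B"].where(df["Col_A"].isna(), filled))

# Variant 2: forward-fill everything, then drop rows that still contain a NaN.
compact = df.assign(Col_B=filled).dropna()

print(keep_all)
print(compact)
```

Both variants are vectorized, so no Python-level loop runs over the rows, which is the point when the frame has millions of rows.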


Answer