Skip to content
Advertisement

Copy the last seen non empty value of a column based on a condition in most efficient way in Pandas/Python

I need to copy and paste the previous non-empty value of a column based on a condition. I need to do it in the most efficient way because the number of rows is a couple of millions. Using for loop will be computationally costly.

So it will be highly appreciated if somebody can help me in this regard.

|Col_A   |Col_B   |
|--------|--------|
|10.2.6.1| NaN    |
|  NaN   | 51     |
|  NaN   | NaN    |
|10.2.6.1| NaN    |
|  NaN   | 64     |
|  NaN   | NaN    |
|  NaN   | NaN    |
|10.2.6.1| NaN    |

Based on the condition, whenever the Col_A will have any value (not null) 10.2.6.1 in this example, the last seen value in Col_B (51...64 respectively) will be paste on that corresponding row where the Col_A value is not null. And the dataset should look like this:

|Col_A   |Col_B   |
|--------|--------|
|10.2.6.1| NaN    |
|  NaN   | 51     |
|  NaN   | NaN    |
|10.2.6.1| 51     |
|  NaN   | 64     |
|  NaN   | NaN    |
|  NaN   | NaN    |
|10.2.6.1| 64     |

I tried with this code below but it’s not working:

df.loc[df["Col_A"].notnull(),'Col_B'] = df.loc[df["Col_B"].notnull(),'Col_B']

Advertisement

Answer

You can forward-fill the NaN values using ffill with the most recent non-NaN value.

If you want to keep the NaNs in Col_B then simply create a new column (Col_C) as follows:

df['Col_C'] = df['Col_B'].ffill()

Then replace the value in Col_B where Col_A has a value:

df.loc[df['Col_A'].notnull(), 'Col_B'] = df.loc[df['Col_A'].notnull(), 'Col_C']
df = df.drop(columns=['Col_C'])

Result:

       Col_A    Col_B
0   10.2.6.1      NaN
1        NaN     51.0
2        NaN      NaN
3   10.2.6.1     51.0
4        NaN     64.0
5        NaN      NaN
6        NaN      NaN
7   10.2.6.1     64.0

The above can be simplified if you do not need to keep all NaN rows. For example, it’s possible to do:

df['Col_B'] = df['Col_B'].ffill()
df = df.dropna()

Result:

       Col_A    Col_B
3   10.2.6.1     51.0
7   10.2.6.1     64.0
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement