Copy the last seen non-empty value of a column based on a condition in the most efficient way in Pandas/Python

I need to copy the previous non-empty value of a column into rows that meet a condition. It has to be done as efficiently as possible because the dataset has a couple of million rows, so a for loop would be computationally costly.

So it will be highly appreciated if somebody can help me in this regard.

The condition is this: whenever Col_A has a non-null value (10.2.6.1 in this example), the last seen value in Col_B (51...64 respectively) should be copied into that row. The dataset should then look like this:

I tried the code below, but it’s not working:

Answer

You can use ffill to forward-fill each NaN with the most recent non-NaN value in the column.

If you want to keep the NaNs in Col_B then simply create a new column (Col_C) as follows:

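A minimal sketch of that step, assuming a small made-up DataFrame. The column names come from the question, but the sample values are only illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the asker's actual dataset.
df = pd.DataFrame({
    "Col_A": [np.nan, np.nan, "10.2.6.1", np.nan, "10.2.6.1"],
    "Col_B": [51.0, np.nan, np.nan, 64.0, np.nan],
})

# Forward-fill Col_B into a helper column; Col_B itself keeps its NaNs.
df["Col_C"] = df["Col_B"].ffill()
print(df)
```

ffill is vectorised, so it should scale to millions of rows far better than a Python loop.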

Then replace the value in Col_B where Col_A has a value:

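One way this could look end to end, again on made-up sample data. The `mask` variable and the final `drop` are my own additions for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the asker's actual dataset.
df = pd.DataFrame({
    "Col_A": [np.nan, np.nan, "10.2.6.1", np.nan, "10.2.6.1"],
    "Col_B": [51.0, np.nan, np.nan, 64.0, np.nan],
})

# Helper column holding the last seen non-NaN value of Col_B.
df["Col_C"] = df["Col_B"].ffill()

# Rows where Col_A has a value.
mask = df["Col_A"].notna()

# Copy the forward-filled value into Col_B only on those rows.
df.loc[mask, "Col_B"] = df.loc[mask, "Col_C"]

# The helper column is no longer needed (optional clean-up).
df = df.drop(columns="Col_C")
print(df)
```

Rows where Col_A is NaN are left alone, so any NaN that Col_B had on those rows is preserved.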

Result:


The above can be simplified if you do not need to keep all NaN rows. For example, it’s possible to do:

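A sketch of one possible simplification, assuming it means the NaNs in Col_B do not have to be preserved anywhere, so the column can be forward-filled in place (sample data is made up):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the asker's actual dataset.
df = pd.DataFrame({
    "Col_A": [np.nan, np.nan, "10.2.6.1", np.nan, "10.2.6.1"],
    "Col_B": [51.0, np.nan, np.nan, 64.0, np.nan],
})

# Forward-fill Col_B directly; no helper column or mask is needed.
df["Col_B"] = df["Col_B"].ffill()
print(df)
```

Note that this also fills the rows where Col_A is NaN, which is the trade-off compared with the two-step version.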

Result:

User contributions licensed under: CC BY-SA