
Replacing NaN values in a DataFrame row with values from other rows based on a (non-unique) column value

I have a DataFrame similar to the following where I have a column with a non-unique value (in this case address) as well as some other columns containing information about it.
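As an illustration, a frame of this shape could be built as below. Only the column names `address`, `val` and `val2` come from the question; the addresses and numbers are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the addresses and numbers are made up.
# Two addresses repeat, and one occurrence of each is missing its values.
df = pd.DataFrame(
    {
        "address": ["1 Main St", "2 Oak Ave", "1 Main St", "3 Elm Rd", "2 Oak Ave"],
        "val": [10.0, np.nan, np.nan, 30.0, 20.0],
        "val2": [1.5, np.nan, np.nan, 3.5, 2.5],
    }
)

print(df)
```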


Some of the addresses appear more than once in the DataFrame and some of those repeated ones are missing information. If a certain row is missing the values, but that address appears in another row in the DataFrame, I’d like to replace the NaN values with those from the same address to get something like this:
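A sketch of the kind of result being asked for, using invented addresses and numbers (only the column names come from the question): every row that was missing `val` and `val2` now carries the values recorded for the same address elsewhere.

```python
import pandas as pd

# Hypothetical target frame: each NaN has been replaced by the value
# found in another row with the same address.
expected = pd.DataFrame(
    {
        "address": ["1 Main St", "2 Oak Ave", "1 Main St", "3 Elm Rd", "2 Oak Ave"],
        "val": [10.0, 20.0, 10.0, 30.0, 20.0],
        "val2": [1.5, 2.5, 1.5, 3.5, 2.5],
    }
)

# Sanity check: each address carries exactly one distinct value per column.
consistent = (expected.groupby("address")[["val", "val2"]].nunique() == 1).all().all()
print(consistent)
```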


Using something like a dictionary would be infeasible since the DataFrame contains thousands of different addresses.

EDIT: It’s safe to assume that either both values are missing or both are present. In other words, there will never be a row with only val and not val2 or vice-versa. However, an answer that could take that possible circumstance into account would be even better!


Answer

There are a number of ways you can do this; the easiest is to group by address and ffill / bfill within each group.
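A minimal sketch of the groupby approach, assuming the columns are named `address`, `val` and `val2` as in the question. The sample data is invented, and `transform` with a lambda is one possible way to write it, not necessarily the exact form the answer used.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (addresses and numbers are made up).
df = pd.DataFrame(
    {
        "address": ["1 Main St", "2 Oak Ave", "1 Main St", "3 Elm Rd", "2 Oak Ave"],
        "val": [10.0, np.nan, np.nan, 30.0, 20.0],
        "val2": [1.5, np.nan, np.nan, 3.5, 2.5],
    }
)

cols = ["val", "val2"]

# Within each address group, forward-fill and then back-fill each column,
# so a missing value is taken from whichever row of that group has one.
df[cols] = df.groupby("address")[cols].transform(lambda s: s.ffill().bfill())

print(df)
```

Because each column is filled independently, this should also cover the case where a row has only one of `val` and `val2` missing.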


Another, more performant method is to use update along your axis.
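One way the `update` idea could be sketched, under the same assumed column names and invented data. The helper names `lookup` and `other` are illustrative. The lookup is joined back onto the original rows first, so that `DataFrame.update` aligns on a unique row index rather than on the repeated addresses.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (addresses and numbers are made up).
df = pd.DataFrame(
    {
        "address": ["1 Main St", "2 Oak Ave", "1 Main St", "3 Elm Rd", "2 Oak Ave"],
        "val": [10.0, np.nan, np.nan, 30.0, 20.0],
        "val2": [1.5, np.nan, np.nan, 3.5, 2.5],
    }
)

cols = ["val", "val2"]

# One row per address holding the first non-null value of each column
# (GroupBy.first skips NaN within each column).
lookup = df.groupby("address")[cols].first()

# Broadcast the lookup back to every row, keeping df's own row index.
other = df[["address"]].join(lookup, on="address")[cols]

# update works in place; overwrite=False only fills cells that are NaN in df.
df.update(other, overwrite=False)

print(df)
```

With `overwrite=False`, values already present in `df` are left untouched, so only the gaps are filled.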

User contributions licensed under: CC BY-SA