How to merge two DataFrames in pandas to replace NaN

I want to do the following in pandas:

I have two DataFrames, A and B, and I want to replace only the NaN values in A with the corresponding values from B.

A                                                
2014-04-17 12:59:00  146.06250  146.0625  145.93750  145.93750
2014-04-17 13:00:00  145.90625  145.9375  145.87500  145.90625
2014-04-17 13:01:00  145.90625       NaN  145.90625        NaN
2014-04-17 13:02:00        NaN       NaN  145.93750  145.96875

B
2014-04-17 12:59:00   146 2/32   146 2/32  145 30/32  145 30/32
2014-04-17 13:00:00  145 29/32  145 30/32  145 28/32  145 29/32
2014-04-17 13:01:00  145 29/32        146  145 29/32        147
2014-04-17 13:02:00        146        146  145 30/32  145 31/32

Result:
2014-04-17 12:59:00  146.06250  146.0625  145.93750  145.93750
2014-04-17 13:00:00  145.90625  145.9375  145.87500  145.90625
2014-04-17 13:01:00  145.90625       146  145.90625        147
2014-04-17 13:02:00        146       146  145.93750  145.96875


Answer

The documented way to do exactly this is A.combine_first(B). See the official pandas documentation for details.
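As a minimal sketch of that call on the data from the question: the column names below are hypothetical (the question does not show any), and B is written as plain floats, with the 32nds fractions converted to decimals, purely to keep the example numeric.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime([
    "2014-04-17 12:59:00",
    "2014-04-17 13:00:00",
    "2014-04-17 13:01:00",
    "2014-04-17 13:02:00",
])
cols = ["c1", "c2", "c3", "c4"]  # hypothetical column names

A = pd.DataFrame(
    [
        [146.06250, 146.0625, 145.93750, 145.93750],
        [145.90625, 145.9375, 145.87500, 145.90625],
        [145.90625, np.nan, 145.90625, np.nan],
        [np.nan, np.nan, 145.93750, 145.96875],
    ],
    index=idx,
    columns=cols,
)

# Assumption: B is numeric here; in the question it holds strings such as "146 2/32".
B = pd.DataFrame(
    [
        [146.06250, 146.0625, 145.93750, 145.93750],
        [145.90625, 145.9375, 145.87500, 145.90625],
        [145.90625, 146.0, 145.90625, 147.0],
        [146.0, 146.0, 145.93750, 145.96875],
    ],
    index=idx,
    columns=cols,
)

# Values from A are kept wherever they are present; only NaN cells are taken from B.
result = A.combine_first(B)
print(result)
```

Note that combine_first returns the union of both indexes and columns, whereas A.fillna(B) keeps the shape of A, so the two only give identical results when A and B share the same labels, as they do here.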

However, A.fillna(B) outperforms it considerably on large DataFrames (tested with 25,000 elements):

In[891]: %timeit df.fillna(df2)
1000 loops, best of 3: 333 µs per loop
In[892]: %timeit df.combine_first(df2)
100 loops, best of 3: 2.15 ms per loop
In[894]: (df.fillna(df2) == df.combine_first(df2)).all().all()
Out[890]: True
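To repeat a comparison like this outside IPython, something along these lines should work. It is only a sketch: the shape, the share of NaN values, and the random data are assumptions, since the original test frames are not shown, and the timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
shape = (5000, 5)  # assumed shape: 25,000 elements in total

values = rng.random(shape)
values[rng.random(shape) < 0.2] = np.nan  # assumed: roughly 20% of cells missing
df = pd.DataFrame(values)
df2 = pd.DataFrame(rng.random(shape))  # no missing values in the filler frame

# Time 100 calls of each method on the same inputs.
t_fillna = timeit.timeit(lambda: df.fillna(df2), number=100)
t_combine = timeit.timeit(lambda: df.combine_first(df2), number=100)
print(f"fillna:        {t_fillna / 100 * 1e3:.3f} ms per call")
print(f"combine_first: {t_combine / 100 * 1e3:.3f} ms per call")

# Same check as in the session above: both methods give the same frame.
same = bool((df.fillna(df2) == df.combine_first(df2)).all().all())
print(same)
```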



Answer