I want to do this in pandas:
I have two DataFrames, A and B, and I want to replace only the NaN values in A with the corresponding values from B.
A
2014-04-17 12:59:00 146.06250 146.0625 145.93750 145.93750
2014-04-17 13:00:00 145.90625 145.9375 145.87500 145.90625
2014-04-17 13:01:00 145.90625 NaN 145.90625 NaN
2014-04-17 13:02:00 NaN NaN 145.93750 145.96875

B
2014-04-17 12:59:00 146 2/32 146 2/32 145 30/32 145 30/32
2014-04-17 13:00:00 145 29/32 145 30/32 145 28/32 145 29/32
2014-04-17 13:01:00 145 29/32 146 145 29/32 147
2014-04-17 13:02:00 146 146 145 30/32 145 31/32

Result:
2014-04-17 12:59:00 146.06250 146.0625 145.93750 145.93750
2014-04-17 13:00:00 145.90625 145.9375 145.87500 145.90625
2014-04-17 13:01:00 145.90625 146 145.90625 147
2014-04-17 13:02:00 146 146 145.93750 145.96875
Answer
The method intended for exactly this is A.combine_first(B). Further information is in the official documentation.
However, on large DataFrames it is considerably slower than A.fillna(B) (tested with 25,000 elements):
In[891]: %timeit df.fillna(df2)
1000 loops, best of 3: 333 µs per loop
In[892]: %timeit df.combine_first(df2)
100 loops, best of 3: 2.15 ms per loop
In[894]: (df.fillna(df2) == df.combine_first(df2)).all().all()
Out[890]: True
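As a rough illustration, the sketch below rebuilds the question's data and applies both calls. The column names (c1 to c4) are hypothetical, since the question does not show any, and the values in B are assumed to be prices quoted in 32nds that have already been converted to decimals. The last part uses a made-up extra row (B_extra) to show one difference worth knowing: fillna keeps only A's index, whereas combine_first returns the union of both indexes.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime([
    "2014-04-17 12:59:00",
    "2014-04-17 13:00:00",
    "2014-04-17 13:01:00",
    "2014-04-17 13:02:00",
])
cols = ["c1", "c2", "c3", "c4"]  # hypothetical column names

A = pd.DataFrame(
    [
        [146.06250, 146.0625, 145.93750, 145.93750],
        [145.90625, 145.9375, 145.87500, 145.90625],
        [145.90625, np.nan, 145.90625, np.nan],
        [np.nan, np.nan, 145.93750, 145.96875],
    ],
    index=idx,
    columns=cols,
)

# B from the question, with the 32nds assumed to be converted to decimals.
B = pd.DataFrame(
    [
        [146 + 2 / 32, 146 + 2 / 32, 145 + 30 / 32, 145 + 30 / 32],
        [145 + 29 / 32, 145 + 30 / 32, 145 + 28 / 32, 145 + 29 / 32],
        [145 + 29 / 32, 146.0, 145 + 29 / 32, 147.0],
        [146.0, 146.0, 145 + 30 / 32, 145 + 31 / 32],
    ],
    index=idx,
    columns=cols,
)

# Both calls fill only the NaN cells of A with the aligned values from B.
filled = A.fillna(B)
combined = A.combine_first(B)
print(filled)

# Hypothetical extra row that exists only in B, to show the index behaviour.
extra = pd.DataFrame(
    [[146.0] * 4],
    index=pd.to_datetime(["2014-04-17 13:03:00"]),
    columns=cols,
)
B_extra = pd.concat([B, extra])

kept_index = A.fillna(B_extra)            # still A's 4 rows
union_index = A.combine_first(B_extra)    # 5 rows: A's index plus B's extra row
```

If the two frames are known to share the same index and columns, the two results match and fillna is the cheaper choice according to the timings above; if B may contain rows or columns that A lacks and they should appear in the result, combine_first is the one that adds them.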