Merging two dataframes on timestamp while preserving all data

Question

I want to merge two dataframes to create a single time-series with two variables. I have a function that does this by iterating over each dataframe using iterrows()... which is terribly slow and doesn't take advantage of the vectorization that pandas and numpy provide... Would you be able to hel…

Accepted Answer

This can be broken down into 2 steps:

The first step is the equivalent of an outer join in SQL, where you create a table containing the keys of both source tables. This is done with merge(..., how="outer").

The second is filling the NaN values with the previous non-NaN values, which can be done with ffill.

    z = a.merge(b, on="timestamp", how="outer").sort_values("timestamp").ffill()
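To see what the two steps produce, here is a minimal sketch on made-up data. The frames a and b, the value columns "x" and "y", and the timestamps are all assumptions for illustration, since the question's actual data is not shown. The trailing reset_index(drop=True) is an optional addition that is not part of the answer above; it only tidies the row labels after sorting.

```python
import pandas as pd

# Hypothetical sample frames; the column names "x" and "y" are assumptions.
a = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2021-01-01 00:00", "2021-01-01 00:02", "2021-01-01 00:04"]
    ),
    "x": [1.0, 2.0, 3.0],
})
b = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2021-01-01 00:01", "2021-01-01 00:02", "2021-01-01 00:05"]
    ),
    "y": [10.0, 20.0, 30.0],
})

z = (
    a.merge(b, on="timestamp", how="outer")  # step 1: keep every timestamp from both frames
     .sort_values("timestamp")               # put the combined rows in time order
     .ffill()                                # step 2: carry the last known value forward
     .reset_index(drop=True)                 # optional: renumber rows 0..n-1
)

print(z)
```

Note that ffill can only fill from earlier rows, so a column stays NaN until its first observation. In this sketch "y" is still NaN at 00:00 because b has no value at or before that time.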

Answer