Efficient way to find the most recent entry in another dataframe for each entry of a datetime-indexed dataframe in pandas

I have two dataframes, both indexed by datetime. For example, dataframe 1 looks something like this:

and dataframe 2 looks like:

For each entry in dataframe 1, I want to find the most recent entry in dataframe 2 and add a new column to dataframe 1 that records this relationship between the two dataframes.

To make it clearer, the expected result should look like this:

For the first entry in dataframe 1, 2021-11-11 09:00, the most recent entry is 2021-11-10 11:00. For the third entry, 2021-11-12 11:00, the most recent entry (that is, the largest timestamp in dataframe 2 that does not exceed 2021-11-12 11:00) is 2021-11-11 09:30.

Is there a pandas method that can do this efficiently?

Many thanks.

Answer

pandas.merge_asof should suffice.
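As a minimal sketch of how that could look: the frames below are hypothetical stand-ins, since only the four timestamps mentioned in the question (2021-11-11 09:00, 2021-11-12 11:00, 2021-11-10 11:00 and 2021-11-11 09:30) are known, and the column names `value1`, `value2` and `most_recent` are assumptions made up for illustration. The idea is to copy dataframe 2's index into an ordinary column, then do a backward as-of merge on the two indexes so that each row of dataframe 1 picks up the latest dataframe 2 timestamp that does not exceed its own.

```python
import pandas as pd

# Hypothetical sample data: only four of these timestamps come from the
# question; the remaining timestamps, values and column names are made up.
df1 = pd.DataFrame(
    {"value1": [1, 2, 3]},
    index=pd.to_datetime(
        ["2021-11-11 09:00", "2021-11-11 10:00", "2021-11-12 11:00"]
    ),
)
df2 = pd.DataFrame(
    {"value2": [10, 20, 30]},
    index=pd.to_datetime(
        ["2021-11-10 11:00", "2021-11-11 09:30", "2021-11-12 12:00"]
    ),
)

# Copy df2's index into a regular column so the matched timestamp
# is still visible after the merge.
right = df2.assign(most_recent=df2.index)

# merge_asof requires both sides to be sorted on the merge key.
# direction="backward" (the default) selects the last row of `right`
# whose timestamp is less than or equal to the left row's timestamp.
result = pd.merge_asof(
    df1.sort_index(),
    right.sort_index(),
    left_index=True,
    right_index=True,
    direction="backward",
)

print(result)
```

With this sample data, the first row (2021-11-11 09:00) is matched to 2021-11-10 11:00 and the third row (2021-11-12 11:00) to 2021-11-11 09:30, as described in the question. If a timestamp that is exactly equal should not count as a match, `allow_exact_matches=False` can be passed.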

User contributions licensed under: CC BY-SA