I have two dataframes which are:
                Value
    Date
    2010-06-29      3
    2010-06-30      1
    2010-07-01      5
    2010-07-02      4
    2010-07-03      9
    2010-07-04      7
    2010-07-05      2
    2010-07-06      3

and

                Value
    Date
    2010-06-29      6
    2010-07-03      1
    2010-07-06      4
The first dataframe can be created with this Python code:
    import pandas as pd

    df = pd.DataFrame(
        {
            'Date': ['2010-06-29', '2010-06-30', '2010-07-01', '2010-07-02',
                     '2010-07-03', '2010-07-04', '2010-07-05', '2010-07-06'],
            'Value': [3, 1, 5, 4, 9, 7, 2, 3]
        }
    )
    df['Date'] = pd.to_datetime(df['Date']).dt.date
    df = df.set_index('Date')
and the second dataframe:
    df2 = pd.DataFrame(
        {
            'Date': ['2010-06-29', '2010-07-03', '2010-07-06'],
            'Value': [6, 1, 4]
        }
    )
    df2['Date'] = pd.to_datetime(df2['Date']).dt.date
    df2 = df2.set_index('Date')
I want to add a second column to the first dataframe. For each Date in the first dataframe, the new column should hold the value from the second dataframe at the most recent Date that is equal to or earlier than that Date.
So, the desired output is:

                Value  Value_2
    Date
    2010-06-29      3        6
    2010-06-30      1        6
    2010-07-01      5        6
    2010-07-02      4        6
    2010-07-03      9        1
    2010-07-04      7        1
    2010-07-05      2        1
    2010-07-06      3        4
Also, I would strongly prefer a solution that does not use any for-loops.
How can I do this?
Answer
pd.merge_asof should suffice for this:
    df.index = pd.to_datetime(df.index)
    df2.index = pd.to_datetime(df2.index)
    pd.merge_asof(df, df2, on='Date')

            Date  Value_x  Value_y
    0 2010-06-29        3        6
    1 2010-06-30        1        6
    2 2010-07-01        5        6
    3 2010-07-02        4        6
    4 2010-07-03        9        1
    5 2010-07-04        7        1
    6 2010-07-05        2        1
    7 2010-07-06        3        4
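
If the result should match the layout asked for in the question (Date kept as the index, columns named Value and Value_2), one possible variation is to join on the indexes and pass custom suffixes. This is only a sketch: the name `result` is an illustrative choice, the frames are rebuilt here so the snippet runs on its own, and both indexes are assumed to be sorted, which merge_asof requires.

```python
import pandas as pd

# Rebuild the two frames from the question with a DatetimeIndex on each;
# merge_asof needs a numeric or datetime-like key, not plain date objects.
df = pd.DataFrame(
    {"Value": [3, 1, 5, 4, 9, 7, 2, 3]},
    index=pd.to_datetime([
        "2010-06-29", "2010-06-30", "2010-07-01", "2010-07-02",
        "2010-07-03", "2010-07-04", "2010-07-05", "2010-07-06",
    ]),
)
df.index.name = "Date"

df2 = pd.DataFrame(
    {"Value": [6, 1, 4]},
    index=pd.to_datetime(["2010-06-29", "2010-07-03", "2010-07-06"]),
)
df2.index.name = "Date"

# direction="backward" (the default) picks, for each row of df, the last
# row of df2 whose index is equal to or earlier than that row's index.
result = pd.merge_asof(
    df,
    df2,
    left_index=True,
    right_index=True,
    suffixes=("", "_2"),   # keep "Value" as is, rename df2's column to "Value_2"
    direction="backward",
)

print(result)
```

No explicit loop is involved. Any row of df dated before the first Date in df2 would get NaN in Value_2, which does not happen with this data.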