How to prevent data from being recycled when using pd.merge_asof in Python

Question

I am looking to join two data frames using the pd.merge_asof function. This function allows me to match data on a unique id and/or a nearest key. In this example, I am matching on the id as well as the nearest date that is less than or equal to the date in df1. Is there a way to prevent the

Accepted Answer

Given that your merge direction is backward, you can mask the rows that duplicate an id and df2's date after merge_asof:

    out = pd.merge_asof(df1,
                        df2.rename(columns={'date': 'date1'}),  # rename df2's date
                        left_on='date',
                        right_on='date1',                       # so we can work on it later
                        by='id',
                        direction='backward',
                        allow_exact_matches=True)

    # mask the value
    out['value'] = out['value'].mask(out.duplicated(['id', 'date1']))

    # equivalently
    # out.loc[out.duplicated(['id', 'date1']), 'value'] = np.nan

Output:

            date id      date1 value
    0 2020-01-02  a 2020-01-01     1
    1 2020-02-02  a 2020-01-01   NaN
    2 2020-03-02  a 2020-01-01   NaN
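The question's input frames are not shown here, so the following is only a minimal self-contained sketch of the same approach. The contents of df1 and df2 are hypothetical, picked so that the result has the same shape as the output above; the helper name `reused` is also an assumption made for readability.

```python
import pandas as pd

# Hypothetical inputs (not from the original question): three dates in df1
# and a single earlier row in df2 for the same id.
df1 = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-02", "2020-02-02", "2020-03-02"]),
    "id": ["a", "a", "a"],
})
df2 = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01"]),
    "id": ["a"],
    "value": [1],
})

# Both frames must be sorted by their date keys for merge_asof.
out = pd.merge_asof(
    df1,
    df2.rename(columns={"date": "date1"}),  # keep df2's date as its own column
    left_on="date",
    right_on="date1",
    by="id",
    direction="backward",
    allow_exact_matches=True,
)

# duplicated() marks every repeat of an (id, date1) pair after its first
# occurrence, i.e. the rows where the same df2 row was matched again.
reused = out.duplicated(["id", "date1"])
out["value"] = out["value"].mask(reused)

print(out)
```

Because `duplicated` keeps the first occurrence by default, only the earliest df1 row that matched a given df2 row retains its value; later matches of the same row become NaN.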

Answer