Find value smaller but closest to current value

Question

I have a very large pandas dataframe that contains two columns, column A and column B. For each value in column A, I would like to find the largest value in column B that is less than the corresponding value in column A. Note that each value in column B can be mapped to many values in column A. Here's

Accepted Answer

Yes, this is doable using pandas.merge_asof. The explanation is given as comments in the code:

    import pandas as pd

    df = pd.DataFrame({'a' : [1, 5, 7, 2, 3, 4], 'b' : [5, 2, 7, 5, 1, 9]})

    # merge_asof requires the keys to be sorted
    adf = df[['a']].sort_values(by='a')
    bdf = df[['b']].sort_values(by='b')

    # your example wants 'strictly less' so we also add 'allow_exact_matches=False'
    cdf_ordered = pd.merge_asof(adf, bdf, left_on='a', right_on='b', allow_exact_matches=False, direction='backward')

    # rename the dataframe |a|b| -> |a|c|
    cdf_ordered = cdf_ordered.rename(columns={'b': 'c'})

    # since c is based on sorted a, we merge with original dataframe column a
    new_df = pd.merge(df, cdf_ordered, on='a')

    print(new_df)
    """
       a  b    c
    0  1  5  NaN
    1  5  2  2.0
    2  7  7  5.0
    3  2  5  1.0
    4  3  1  2.0
    5  4  9  2.0
    """
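For comparison, here is a rough sketch of the same lookup done with numpy.searchsorted instead of merge_asof. This is not part of the answer above, only an alternative under some assumptions: both columns are numeric, column b has no missing values, and the result column is named c to match the example. It skips the final merge on a, so repeated values in a should not multiply rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 5, 7, 2, 3, 4], 'b': [5, 2, 7, 5, 1, 9]})

# Sort the candidate values from column b once.
b_sorted = np.sort(df['b'].to_numpy())

# side='left' gives, for each a, the count of b values strictly less than a,
# so idx - 1 is the position of the largest b that is < a.
idx = np.searchsorted(b_sorted, df['a'].to_numpy(), side='left')

# idx == 0 means no b is smaller than a; use NaN there.
# np.maximum guards the index so the lookup never wraps around to -1.
df['c'] = np.where(idx > 0, b_sorted[np.maximum(idx - 1, 0)], np.nan)

print(df)
```

On the sample data this should give the same c column as the merge_asof version. To allow exact matches instead of strictly smaller values, side='right' would be the corresponding change.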

Answer