Skip to content
Advertisement

Join two dataframes by range and values

I have two dataframes like this:

df1:

Value Responsible
11000 Jack
21040 Dylan
12050 Jack

df2:

Start End
10001 20000
20001 30000

Desired output:

Start End Responsible
10001 20000 Jack
20001 30000 Dylan

I need to join the ‘Responsible’ column in df2, using ‘Value’ as key that is in the range set by ‘Start’ and ‘End’. Let’s define that ‘Value’ has unique ‘Responsible’ for each range, but there are repeated values of ‘Responsibles’ in df1 (again, just one for each range).

Is there an easy way to do that?

Advertisement

Answer

First find the smallest Value that is larger than Start, then make sure it is smaller than End:

import pandas as pd
df1 = pd.DataFrame({'Value':[11000,21040,12050], 'Responsible':['Jack', 'Dylan', 'Jack']})
df2 = pd.DataFrame({'Start':[10001,20001], 'End':[20000, 30000]})

df = pd.merge_asof(df2.sort_values('Start'), df1.sort_values('Value'),
                   left_on = 'Start', right_on = 'Value', direction='forward')
df = df[df['Value']<df['End']].drop(columns = 'Value')
    Start   End     Responsible
0   10001   20000   Jack
1   20001   30000   Dylan
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement