I have two dataframes like this:
df1:
Value | Responsible |
---|---|
11000 | Jack |
21040 | Dylan |
12050 | Jack |
df2:
Start | End |
---|---|
10001 | 20000 |
20001 | 30000 |
Desired output:
Start | End | Responsible |
---|---|---|
10001 | 20000 | Jack |
20001 | 30000 | Dylan |
I need to join the ‘Responsible’ column in df2, using ‘Value’ as key that is in the range set by ‘Start’ and ‘End’. Let’s define that ‘Value’ has unique ‘Responsible’ for each range, but there are repeated values of ‘Responsibles’ in df1 (again, just one for each range).
Is there an easy way to do that?
Advertisement
Answer
First find the smallest Value
that is larger than Start
, then make sure it is smaller than End
:
import pandas as pd df1 = pd.DataFrame({'Value':[11000,21040,12050], 'Responsible':['Jack', 'Dylan', 'Jack']}) df2 = pd.DataFrame({'Start':[10001,20001], 'End':[20000, 30000]}) df = pd.merge_asof(df2.sort_values('Start'), df1.sort_values('Value'), left_on = 'Start', right_on = 'Value', direction='forward') df = df[df['Value']<df['End']].drop(columns = 'Value')
Start End Responsible 0 10001 20000 Jack 1 20001 30000 Dylan