I have two dataframes, df1 and df2. df1 is a table of labels set between ranges:
Top Bottom Label ID 0 0.0 2.2 A Z-1 1 2.2 6.6 B Z-1 2 6.6 9.1 C Z-1 3 0.0 1.2 A Z-2 4 1.2 4.5 B Z-2 5 6.6 9.1 C Z-2
Where the ID column relates to a higher order grouping. Both df1 and df2 contain shared ID’s and I want to apply the labels from df1 to df2.
df2 is a table of regularly sampled data:
Samp Var ID 0 0.0 157 Z-1 1 0.5 226 Z-1 2 1.5 843 Z-1 3 2.0 999 Z-1 4 2.5 142 Z-1 5 3.0 167 Z-1 6 0.0 157 Z-2 7 0.5 226 Z-2 8 1.5 111 Z-2 9 2.0 666 Z-3
The desired output would look like this:
Samp Var ID Label 0 0.0 157 Z-1 A 1 0.5 226 Z-1 A 2 1.5 843 Z-1 A 3 2.0 999 Z-1 A 4 2.5 142 Z-1 B 5 3.0 167 Z-1 B 6 0.0 157 Z-2 A 7 0.5 226 Z-2 A 8 1.5 111 Z-2 B 9 2.0 666 Z-3 B
The issue I am having when looking for other answers is that I first need to match the ID’s and then apply the labels within the range between the top and bottom columns. Any help would be appreciated!
Advertisement
Answer
here is one way to do it Z-3 doesn’t exists in your reference table and hence its null.
Idea is to merge on ID and then query the ones where Samp falls within the range, and then its just a cleanup of result set
df2.merge(df, on='ID', how='left', suffixes=("","_y") ).query('(Samp>=Top & Samp<=Bottom) | Label.isna()' ).reset_index().drop( columns=['index','Top','Bottom'] )
Samp Var ID Label 0 0.0 157 Z-1 A 1 0.5 226 Z-1 A 2 1.5 843 Z-1 A 3 2.0 999 Z-1 A 4 2.5 142 Z-1 B 5 3.0 167 Z-1 B 6 0.0 157 Z-2 A 7 0.5 226 Z-2 A 8 1.5 111 Z-2 B 9 2.0 666 Z-3 NaN