I have two dataframes colored by approximately matching marks:
The “marks” are not the same in each of them, but some are close. How can I copy the “Evaluated” value from df2 to df1 based on relevant “name” and “mark”?
My code is:
df1 = pd.DataFrame({'Name': ['Lisa', 'Lisa', 'Lisa', 'Hann', 'Hann', 'Hann'], 'Marks': [25.123, 26.425, 27.456, 25.789, 26.124, 26.225], 'Evaluated':['','','','','','']}) df2 = pd.DataFrame({'Name':['Lisa', 'Lisa', 'Lisa', 'Lisa', 'Hann', 'Hann', 'Hann', 'Hann'], 'Marks':[25.125, 26.422, 27.451, 27.465, 25.786, 25.796, 26.121, 26.227], 'Evaluated':[0, 0, 1, 1, 1, 1, 1, 1]}) df3 = pd.merge(df1.round(2), df2.round(2), how='left', on=['Name', 'Marks'])
Expected result is df3
How can I do an approximate match and get the value of the last column? I tried to use df.loc and df.where but they didn’t work because tables are in different shapes. What I expect is similar function to Excel’s Vlookup function where approximation is True. My code changes the values at the end, which I would love to keep as it was in df1. Probably I could make a copy from what I had before, but I believe there is a more pythonic way to solve it, rather than merging the tables.
Thanks in advance!
Advertisement
Answer
You can try pandas.merge_asof
df1 = df1.sort_values(['Marks']) df2 = df2.sort_values(['Marks']) df3 = pd.merge_asof(df1[['Name', 'Marks']], df2, on='Marks', direction='nearest', by='Name')
print(df3) Name Marks Evaluated 0 Lisa 25.123 0 1 Hann 25.789 1 2 Hann 26.124 1 3 Hann 26.225 1 4 Lisa 26.425 0 5 Lisa 27.456 1