One of the solutions that is similar is found in here where the asker only have a single dataframe and their requirements was to match a fixed string value:
result = df.loc[(df['Col1'] =='Team2') & (df['Col2']=='Medium'), 'Col3'].values[0]
However, the problem I encountered with the .loc
method is that it requires the 2 dataframes to have the same size because it will only match values on the same row position of each dataframe. So if the orders of the rows are mixed in either of the dataframes, it will not work as expected.
Sample of this situation is shown below:
df1 – df1 = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
df2 – df2 = pd.DataFrame({'a': [1, 3, 2], 'b': [4, 6, 5]})
Using, df1.loc[(df1['a'] == df2['a']) & (df1['b'] == df2['b']), 'Status'] = 'Included'
will yield:
But i’m looking for something like this:
I have looked into methods such as .lookup
but is deprecated as of December of the year 2020 (which also requires similar sized dataframes).
Advertisement
Answer
Use DataFrame.merge
with indicator
parameter for new column with this information, if need change values e.g. use numpy.where
:
df = df1.merge(df2, indicator='status', how='left') df['status'] = np.where(df['status'].eq('both'), 'included', 'not included') print (df) a b status 0 1 4 included 1 2 5 included 2 3 6 included