Compare two DataFrame column values and join on a condition in Python?

I need to join the DataFrames below based on a condition.

df_output

I need to join two DataFrames, df1 and df2, on the Id column, but a row should count as a match only when every element of its Id list appears in the other DataFrame's Id list.

Answer

While this isn't a highly efficient solution, you can solve this problem with sets.

In the above snippet:

  1. matches = df1["Id"].apply(set) <= df2["Id"].apply(set) returns a boolean Series that is True where every element of a row in df1["Id"] is also in the corresponding row of df2["Id"] (a subset test), and False otherwise.
  2. Instead of performing an actual merge, we can simply align the two DataFrames on that boolean Series.
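
The two steps above can be sketched as follows. This is only a minimal illustration, not the original snippet: the sample rows and the val1/val2 columns are made up, and it assumes both DataFrames share the same row index so they can be compared row by row.

```python
import pandas as pd

# Hypothetical sample data: each Id cell holds a list of identifiers.
df1 = pd.DataFrame({"Id": [["a"], ["b", "c"], ["d"]], "val1": [1, 2, 3]})
df2 = pd.DataFrame({"Id": [["a", "x"], ["b"], ["d"]], "val2": [10, 20, 30]})

# Step 1: row-wise subset test. For sets, `<=` means "is a subset of",
# so a row is True only if every df1 Id is present in the df2 Id list.
matches = df1["Id"].apply(set) <= df2["Id"].apply(set)

# Step 2: align on the shared index instead of merging. Rows of df2 that
# did not pass the test are dropped first, so they show up as NaN.
out = df1.join(df2.loc[matches, ["val2"]])
print(out)
```

Here the second row fails the test because "c" is missing from the df2 list, so its val2 is left empty.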

If you want to test Ids against each other across both DataFrames (every row of one against every row of the other), you can take the Cartesian product of the two DataFrames, filter it down to the inner join using the set criterion, and then append back any left join keys that are missing.
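
A rough sketch of that idea, assuming pandas 1.2 or later for merge(how="cross"). The sample data, the val1/val2 columns, and the left_row helper column are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical sample data: each Id cell holds a list of identifiers.
df1 = pd.DataFrame({"Id": [["a"], ["b", "c"], ["z"]], "val1": [1, 2, 3]})
df2 = pd.DataFrame({"Id": [["a", "x"], ["c", "b", "y"]], "val2": [10, 20]})

# Keep each df1 row's label in a column so unmatched rows can be found later.
left = df1.reset_index().rename(columns={"index": "left_row"})

# Cartesian product: every df1 row paired with every df2 row.
pairs = left.merge(df2, how="cross", suffixes=("_1", "_2"))

# Inner join: keep pairs where every df1 Id appears in the df2 Id list.
is_subset = [set(a) <= set(b) for a, b in zip(pairs["Id_1"], pairs["Id_2"])]
inner = pairs[is_subset]

# Left join: append back the df1 rows that matched nothing.
unmatched = left[~left["left_row"].isin(inner["left_row"])]
unmatched = unmatched.rename(columns={"Id": "Id_1"})
result = pd.concat([inner, unmatched], ignore_index=True)
print(result)
```

The cross join builds len(df1) * len(df2) rows, so this is only practical for fairly small inputs.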
