Skip to content

How to drop rows in one DataFrame based on one similar column in another Dataframe that has a different number of rows

I have two DataFrames that are completely dissimilar except for certain values in one particular column:

  First  Last   Email             Age
0 Adam   Smith   30
1 John   Brown  35
2 Joe    Max  40
3 Will   Bill  25
4 Johnny Jacks  50
  ID   Location  Contact
0 5435 Austin
1 4234 Atlanta
2 7896 Toronto

How would I go about finding the matching values in the Email column of df and the Contact column of df2, and then dropping the whole row in df based on that match?

Output I’m looking for (index numbering doesn’t matter):

  First  Last   Email             Age
1 John   Brown  35
3 Will   Bill  25

I’ve been able to identify matches using a few different methods like:

Changing the column names to be identical

common = df.merge(df2,on=['Email'])
df3 = df[(~df['Email'].isin(common['Email']))]

But df3 still shows all the rows from df.

I’ve also tried:

common = df['Email'].isin(df2['Contact'])
df.drop(df[common].index, inplace = True)

And again, identifies the matches but df still contains all original rows.

So the main thing I’m having difficulty with is updating df with the matches dropped or creating a new DataFrame that contains only the rows with dissimilar values when comparing the Email column from df and the Contact column in df2. Appreciate any suggestions.



As mentioned in the comments(@Arkadiusz), it is enough to filter your data using the following

df3 = df[(~df['Email'].isin(df2.Contact))].copy()