How to delete duplicate rows in pandas

I need to check whether one column of a pandas DataFrame contains duplicate values and, if it does, delete the entire row for each duplicate. Only the first column should be checked.

Example:

object    type

apple     fruit
ball      toy
banana    fruit
xbox      videogame
banana    fruit
apple     fruit


What I need is:

object    type

apple     fruit
ball      toy
banana    fruit
xbox      videogame


I can delete the duplicates in the ‘object’ column with the following code, but I can’t delete the entire row that contains each duplicate, because the second column is left untouched.

df = pd.read_csv(directory, header=None)

objects = df[0]

for object in df[0]:


Answer

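Build a boolean mask with duplicated(), which flags every repeat of a value after its first occurrence, then negate the mask so that only the first row for each value is kept: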

df = df[~df["object"].duplicated()]


Which gives

   object       type
0   apple      fruit
1    ball        toy
2  banana      fruit
3    xbox  videogame
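
The question reads the file with header=None, in which case pandas labels the columns 0 and 1 rather than "object" and "type". The following is a minimal sketch of the same idea under that assumption. The inline CSV text is a hypothetical stand-in for the asker's file, and drop_duplicates is shown as an alternative to the negated mask rather than as the answer's own method.

```python
import io

import pandas as pd

# Hypothetical stand-in for the asker's file: same rows, no header line.
csv_text = """apple,fruit
ball,toy
banana,fruit
xbox,videogame
banana,fruit
apple,fruit
"""

# header=None makes pandas label the columns 0 and 1.
df = pd.read_csv(io.StringIO(csv_text), header=None)

# duplicated() is True for every repeat of a value after its first occurrence.
mask = df[0].duplicated()

# Negating the mask keeps the whole row for the first occurrence of each value.
by_mask = df[~mask]

# drop_duplicates with subset= checks only column 0 and drops the full row.
by_drop = df.drop_duplicates(subset=[0], keep="first")

print(by_drop)
```

Both approaches keep the original index labels of the surviving rows. If a fresh 0..n-1 index is wanted after rows are dropped from the middle of the frame, reset_index(drop=True) can be chained on the result.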



Answer