
Remove duplicates and keep the row where a certain column is Yes in a pandas dataframe

I have a dataframe with duplicated values in the column “ID”, like this one:

JavaScript

I need a way to remove duplicates (by “ID”) but keep the rows where the column Primary is “Yes”. Every unique ID has “Yes” in that column, and each duplicated ID has exactly one record with “Yes” and all others with “No”. The result should be this dataframe:

JavaScript

What is the best way to do it?

Thanks!


Answer

Use DataFrame.sort_values on the Primary? column so that the Yes rows end up at the end of the DataFrame, then use DataFrame.drop_duplicates on ID with keep='last'. Note that this solution will return a Primary?=No row if some ID has no Primary?=Yes row:

JavaScript
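The original code and sample tables did not survive on this page, so here is a minimal sketch of the approach described in the answer. The sample data and the Value column are made up for illustration, and the column name Primary? is taken from the answer text.

```python
import pandas as pd

# Hypothetical sample data (the original table is not available).
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3, 3],
    "Primary?": ["No", "Yes", "Yes", "Yes", "No", "No"],
    "Value": ["a", "b", "c", "d", "e", "f"],  # illustrative extra column
})

result = (
    # "No" sorts before "Yes", so the Yes rows move to the end.
    df.sort_values("Primary?", kind="stable")
      # For each ID keep the last row, which is the Yes row when one exists.
      .drop_duplicates("ID", keep="last")
      # Optional: restore the original row order.
      .sort_index()
)

print(result)
```

With this sample data, the result should contain one row per ID, each with Primary? equal to Yes. If an ID had only No rows, one of those No rows would be kept instead of the ID being dropped.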
User contributions licensed under: CC BY-SA