How to drop duplicates in pandas but keep more than the first

Let’s say I have a pandas DataFrame:

I want to drop duplicates when the number of repetitions exceeds a certain threshold n, keeping only the first n of them. Let’s say that n=3. Then my target DataFrame is:

EDIT: Each set of consecutive repetitions is considered separately. In this example, rows 8 and 9 should be kept.

Answer

You can create a unique value for each consecutive group, then use groupby and head:
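
A minimal sketch of that approach follows. The data and the column name `a` are illustrative assumptions, not the frame from the question. The group label comes from comparing each value with the previous row and taking a cumulative sum, so every consecutive run gets its own id.

```python
import pandas as pd

# Illustrative data only; the column name "a" is an assumption.
df = pd.DataFrame({"a": [1, 1, 1, 1, 1, 2, 2, 1, 1, 3, 3, 3, 3]})
n = 3  # maximum number of rows to keep per consecutive run

# A new run starts wherever the value differs from the previous row;
# the cumulative sum of those change points labels each run uniquely.
group_id = df["a"].ne(df["a"].shift()).cumsum()

# Keep at most n rows from each run, preserving the original order and index.
result = df.groupby(group_id).head(n)
print(result)
```

Because the runs are labelled separately, the second run of 1s (rows 7 and 8 in this sample) is kept even though 1 already appeared earlier. A plain `drop_duplicates` would not do this, since it compares values across the whole frame rather than within each consecutive run.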

User contributions licensed under: CC BY-SA