I have the following dataframe:
    id  location  method
    1   456       Phone
    1   456       OS
    6   456       OS
    6   943       Specialist
What I’m trying to do is implement the following logic:
- If there’s only one record (consider the combination of location + method), I’ll just do nothing. That’s the scenario for the first and last row.
- If there’s more than one record (location + method), I want to keep only those where the ID == 1.
So, the resulting dataframe would be:
    id  location  method
    1   456       Phone
    1   456       OS
    6   943       Specialist
If I only filter on the id column within each location, I have this solution:

    df.loc[df['id'].eq(1).groupby(df['location'], sort=False).idxmax()]
(Reference: Filter and apply condition between multiple rows)
But I cannot figure out how to combine this filter with the “method” column. Any ideas?
Answer
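A possible solution: sort by id so that the lowest id (here, 1) comes first within each (location, method) group, then keep the first row of each group: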
    (
        df.sort_values(by='id')
          .groupby(['location', 'method'])
          .first()
          .reset_index()
          .sort_index(axis=1)
    )
Output:
       id  location  method
    0   1       456  OS
    1   1       456  Phone
    2   6       943  Specialist
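If you would rather apply the rule from the question literally (keep a row when its location + method combination is unique, or when its id is 1), a boolean mask is another option. This is only a minimal sketch: the names `group_size` and `result` are made up for illustration, and the dataframe is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [1, 1, 6, 6],
    "location": [456, 456, 456, 943],
    "method": ["Phone", "OS", "OS", "Specialist"],
})

# For every row, count how many rows share its (location, method) combination.
group_size = df.groupby(["location", "method"])["id"].transform("size")

# Keep rows that are alone in their group, or whose id is 1.
result = df[(group_size == 1) | df["id"].eq(1)]
print(result)
```

This keeps the original row order and index. Note that it behaves differently from the sort-and-first approach when a duplicated group has no row with id 1: the mask drops every row of that group, whereas first() keeps the row with the lowest id.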