Filter and apply multiple conditions between multiple rows

I have the following dataframe:

id  location     method
1      456        Phone
1      456        OS
6      456        OS
6      943        Specialist

What I’m trying to do is implement the following logic:

  • If there’s only one record for a given combination of location + method, do nothing. That’s the case for the first and last rows.
  • If there’s more than one record for a given combination of location + method, keep only the rows where id == 1.

So, the resulting dataframe would be:

id  location     method
1      456        Phone
1      456        OS
6      943        Specialist

To filter on the id column alone, I have this solution: df.loc[df['id'].eq(1).groupby(df['location'], sort=False).idxmax()] (Reference: Filter and apply condition between multiple rows)

But I can’t figure out how to combine this filter with the “method” column. Any ideas?

Answer

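A possible solution, which works as long as 1 is the smallest id within each duplicated group: sort by id, then keep the first row of each (location, method) group.

# Sort so the smallest id comes first, then keep one row per (location, method).
(df.sort_values(by='id')
 .groupby(['location', 'method']).first()
 .reset_index().sort_index(axis=1))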

Output:

   id  location      method
0   1       456          OS
1   1       456       Phone
2   6       943  Specialist
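
If you would rather encode the two rules from the question literally instead of relying on sort order, a boolean mask is another option. This is only a sketch: the names group_size and result are mine, and it assumes that a duplicated (location, method) group with no id == 1 row should be dropped entirely, which the question does not spell out.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 6, 6],
    "location": [456, 456, 456, 943],
    "method": ["Phone", "OS", "OS", "Specialist"],
})

# Number of rows sharing each (location, method) pair, aligned to df's index.
group_size = df.groupby(["location", "method"])["id"].transform("size")

# Keep rows that are alone in their group, or whose id is 1.
result = df[(group_size == 1) | (df["id"] == 1)]
print(result)
```

Unlike the groupby/first approach, this keeps the original row order and index, and it does not depend on 1 being the smallest id.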
User contributions licensed under: CC BY-SA