
Remove rows in a group by until the last row meets some condition

I have the following df

JavaScript

We can assume that this data is already sorted. For every id, I need to remove rows so that the following conditions hold:

  1. the first entry for every id is type A
  2. the last entry for every id is type B
  3. that final B is the last B that appears for the id (the data is already sorted)

I’ve accomplished 1. with the following:

df = df.groupby('id').filter(lambda x: x['Type'].iloc[0] == 'A')

This removes an id entirely if its first type isn’t A.
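Note that `filter` keeps the groups for which the function returns True, so the comparison has to be `== 'A'` to keep ids that start with A. A minimal sketch of that step, using a small made-up frame (the real data is not shown here, so the values below are assumptions):

```python
import pandas as pd

# Hypothetical sample data, already sorted; not the frame from the question.
df = pd.DataFrame({
    "id":   [1, 1, 2, 2],
    "Type": ["A", "B", "B", "A"],
})

# groupby().filter() KEEPS groups where the function returns True,
# so this keeps only ids whose first row is type A.
kept = df.groupby("id").filter(lambda g: g["Type"].iloc[0] == "A")

# Equivalent vectorized form: broadcast each group's first Type to its rows.
first_type = df.groupby("id")["Type"].transform("first")
kept_vectorized = df[first_type == "A"]

print(kept)
```

Here id 2 starts with B, so both of its rows are dropped and only id 1 remains.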

For 2. and 3., however, I don’t want to remove the id if the last type isn’t B; I just want to remove everything in the middle.

Resulting df:

JavaScript

example code:

JavaScript


Answer

It seems you could use drop_duplicates with a different rule for each type:

JavaScript

Output:

JavaScript
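For illustration, here is one way the drop_duplicates idea could be sketched. This is not necessarily the exact code from the answer: the sample frame is made up, and it assumes the ids have already been filtered so that each one starts with type A. The assumed rule is to keep the first A and the last B per id.

```python
import pandas as pd

# Hypothetical sample data, already sorted; every id starts with type A.
df = pd.DataFrame({
    "id":   [1, 1, 1, 1, 1, 2, 2, 2],
    "Type": ["A", "B", "A", "B", "A", "A", "A", "B"],
})

# Rule for type A: keep the first A row of each id.
first_a = df[df["Type"].eq("A")].drop_duplicates("id", keep="first")

# Rule for type B: keep the last B row of each id.
last_b = df[df["Type"].eq("B")].drop_duplicates("id", keep="last")

# Combine both selections and restore the original row order.
result = pd.concat([first_a, last_b]).sort_index()

print(result)
```

With this sketch, an id that has no B at all would be left with only its first A row, so that case may need separate handling depending on what you want.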
User contributions licensed under: CC BY-SA