I have a df that looks like this:
Name  Letter  Period  Amount
123   H       PRE     11
123   H       DURING  5
123   H       POST    100
456   H       PRE     9
456   H       DURING  50
456   H       POST    600
789   J       PRE     8
789   J       DURING  9
789   J       POST    200
Currently, I am using this line of code to filter the df so that only rows with a Period of PRE and an Amount greater than 10 are included:
revised_data[ (revised_data['Period'] == 'PRE' ) & (revised_data['Amount'] > 10)]
What I realized, though, is that I actually need to remove the entire group from the df if its PRE row doesn't satisfy the > 10 condition. In that case, all 456 rows and all 789 rows should be removed, because their PRE amounts (9 and 8) are not greater than 10. How might I adjust my code to accomplish this?
Expected Output:
Name  Letter  Period  Amount
123   H       PRE     11
123   H       DURING  5
123   H       POST    100
Answer
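Please try the following. It collects the Name values whose PRE row has an Amount greater than 10, then keeps every row whose Name is in that set (df here stands for your revised_data):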
df.loc[df['Name'].isin(df['Name'].loc[ (df['Period'] == 'PRE' ) & (df['Amount'] > 10)])]
Prints:
   Name Letter  Period  Amount
0   123      H     PRE      11
1   123      H  DURING       5
2   123      H    POST     100
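For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It rebuilds the sample frame from the question (assuming Name and Amount are integers, which the post does not state) and applies the same isin idea as above. It also shows a groupby/transform variant that should give the same result. The variable names pre_ok, keep_names, result_isin, keep_mask and result_groupby are only illustrative and not from the original post.

```python
import pandas as pd

# Sample data copied from the question (dtypes are an assumption)
revised_data = pd.DataFrame({
    "Name": [123, 123, 123, 456, 456, 456, 789, 789, 789],
    "Letter": ["H", "H", "H", "H", "H", "H", "J", "J", "J"],
    "Period": ["PRE", "DURING", "POST"] * 3,
    "Amount": [11, 5, 100, 9, 50, 600, 8, 9, 200],
})

# True only on PRE rows whose Amount is greater than 10
pre_ok = (revised_data["Period"] == "PRE") & (revised_data["Amount"] > 10)

# Approach 1 (same idea as the answer): collect the qualifying Names,
# then keep every row that belongs to one of them
keep_names = revised_data.loc[pre_ok, "Name"]
result_isin = revised_data[revised_data["Name"].isin(keep_names)]

# Approach 2 (alternative): spread the per-row flag across each Name group,
# so a group is kept if any of its rows is a qualifying PRE row
keep_mask = pre_ok.groupby(revised_data["Name"]).transform("any")
result_groupby = revised_data[keep_mask]

print(result_isin)
```

Both approaches keep a group when at least one of its PRE rows passes the condition. If a Name could have several PRE rows and all of them must pass, the logic would need adjusting.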