How can I find rows in a pandas DataFrame where the sum of 2 rows is greater than some value?

In a dataset like the one below, I’m trying to group the rows by attr_1 and attr_2, and if the sum of the count column exceeds a threshold (in this case 100), I want to keep the original rows.

account  attr_1  attr_2  count
ABC      X1      Y1      25
DEF      X1      Y1      100
ABC      X2      Y2      150
DEF      X2      Y2      0
ABC      X3      Y3      10
DEF      X3      Y3      15

I am using the messy approach below, but I’d like to see if there is a cleaner way that I could handle this.

Answer

You can use groupby + filter. The lambda passed to filter should return a single boolean for each group, and every row of a group is kept or dropped based on that value:
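
A minimal sketch of how this might look, assuming the sample data from the question is loaded into a DataFrame. The names `df`, `threshold`, and `result` are illustrative and not taken from the question:

```python
import pandas as pd

# Sample data from the question (df is an illustrative name)
df = pd.DataFrame({
    "account": ["ABC", "DEF", "ABC", "DEF", "ABC", "DEF"],
    "attr_1": ["X1", "X1", "X2", "X2", "X3", "X3"],
    "attr_2": ["Y1", "Y1", "Y2", "Y2", "Y3", "Y3"],
    "count": [25, 100, 150, 0, 10, 15],
})
threshold = 100  # illustrative name for the cutoff mentioned in the question

# The lambda receives each group as a DataFrame and returns one boolean;
# groups whose summed count does not exceed the threshold are dropped whole.
# Note: use g["count"], since g.count would refer to the DataFrame method.
result = df.groupby(["attr_1", "attr_2"]).filter(
    lambda g: g["count"].sum() > threshold
)
print(result)
```

With the sample data, the X1/Y1 group sums to 125 and the X2/Y2 group to 150, so their four rows are kept, while X3/Y3 sums to 25 and is dropped.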

Or use groupby + transform to broadcast each group's sum back onto its rows, which gives a boolean condition aligned row-for-row with the original DataFrame:
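
A minimal sketch of the transform variant, assuming the sample data from the question is loaded into a DataFrame. The names `df`, `threshold`, `mask`, and `result` are illustrative and not taken from the question:

```python
import pandas as pd

# Sample data from the question (df is an illustrative name)
df = pd.DataFrame({
    "account": ["ABC", "DEF", "ABC", "DEF", "ABC", "DEF"],
    "attr_1": ["X1", "X1", "X2", "X2", "X3", "X3"],
    "attr_2": ["Y1", "Y1", "Y2", "Y2", "Y3", "Y3"],
    "count": [25, 100, 150, 0, 10, 15],
})
threshold = 100  # illustrative name for the cutoff mentioned in the question

# transform("sum") returns a Series with the same index as df, where each
# row holds the total count of the group that row belongs to.
group_total = df.groupby(["attr_1", "attr_2"])["count"].transform("sum")

# Comparing against the threshold yields a boolean mask usable on df directly.
mask = group_total > threshold
result = df[mask]
print(result)
```

Because the mask is an ordinary boolean Series, it can be combined with other row-level conditions using `&` or `|` before indexing, which is harder to do with filter.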

User contributions licensed under: CC BY-SA