
How to set the value of the first several rows in a Pandas DataFrame for each group

I am new to groupby methods in Pandas and can't quite get my head around them. I have data with ~2M records, and my current code would take 4 days to run because of its inefficient use of 'append'.

I am analyzing manufacturing data with 2 flags that indicate problems with the test specimens. The flags in the first few rows of each Test_ID should be set to False. (Reason: there is not enough data to accurately analyze these first few rows of each group.)

My inefficient attempt (right result, but not fast enough for 2M rows):

JavaScript

Input:

JavaScript

Output:

JavaScript


Answer

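Let's use groupby().cumcount(), which numbers the rows within each group starting from 0, so the first few rows of each group can be selected with a simple comparison: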

JavaScript

Output:

JavaScript
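
For readers who want something to run, here is a minimal sketch of the groupby().cumcount() idea. It is not the original answer's code. Only the Test_ID column comes from the question; the flag column names (Flag_1, Flag_2), the sample values, and the cutoff N are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data: only Test_ID is named in the question;
# Flag_1, Flag_2 and the values below are illustrative assumptions.
df = pd.DataFrame({
    "Test_ID": ["A", "A", "A", "A", "B", "B", "B"],
    "Flag_1": [True, True, False, True, True, True, True],
    "Flag_2": [True, False, True, True, False, True, True],
})

N = 2  # assumed number of leading rows per group to reset

# cumcount() numbers the rows 0, 1, 2, ... within each Test_ID group,
# so comparing against N gives a boolean mask of the first N rows per group.
first_rows = df.groupby("Test_ID").cumcount() < N

# Reset both flags on those rows in a single vectorized assignment.
df.loc[first_rows, ["Flag_1", "Flag_2"]] = False

print(df)
```

Because the mask is built in one pass and applied with a single .loc assignment, this avoids growing a DataFrame row by row, which is likely where most of the time goes in an append-based loop.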
User contributions licensed under: CC BY-SA