Iterating through pandas groupby groups

Question

I have a pandas dataframe school_df that looks like this: Each row represents one project by that school. I'd like to add two columns: for each unique school_id, a count of how many projects were posted before that date and a count of how many projects were completed before that date. The code below wor…

Accepted Answer

Here is a version using cumcount (I simplified the dates, but it should still work):

    import pandas as pd

    df = pd.DataFrame({'school_id': ['A', 'A', 'A', 'B', 'B'],
                       'date_posted': pd.date_range('2014-01-01', '2014-01-05'),
                       'date_completed': pd.date_range('2014-01-01', '2014-01-05')})

    # cumcount numbers the rows 0, 1, 2, ... within each school_id group
    posted = df.set_index('date_posted').groupby('school_id').cumcount()
    comp = df.set_index('date_completed').groupby('school_id').cumcount()

    # .values drops the date index so the counts are assigned by position
    df['posted'] = posted.values
    df['comp'] = comp.values

    print(df)

Results in:

      date_completed date_posted school_id  posted  comp
    0     2014-01-01  2014-01-01         A       0     0
    1     2014-01-02  2014-01-02         A       1     1
    2     2014-01-03  2014-01-03         A       2     2
    3     2014-01-04  2014-01-04         B       0     0
    4     2014-01-05  2014-01-05         B       1     1
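One thing to watch: cumcount numbers rows in the order they appear within each group, so the answer above relies on the rows already being in date order. If your real data is not sorted, a variation along these lines might help. This is only a sketch under assumptions: the sample data and the helper name count_earlier are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data: rows are deliberately NOT in date order.
df = pd.DataFrame({
    'school_id': ['A', 'A', 'A', 'B', 'B'],
    'date_posted': pd.to_datetime(['2014-01-03', '2014-01-01', '2014-01-02',
                                   '2014-01-05', '2014-01-04']),
    'date_completed': pd.to_datetime(['2014-01-04', '2014-01-06', '2014-01-05',
                                      '2014-01-07', '2014-01-06']),
})


def count_earlier(frame, date_col, group_col='school_id'):
    """Illustrative helper: per row, count same-group rows that sort earlier by date_col."""
    # A stable sort keeps the original relative order of rows with equal dates.
    ordered = frame.sort_values(date_col, kind='mergesort')
    # cumcount numbers rows 0, 1, 2, ... within each group in sorted order.
    # The result keeps the original index labels, so assigning it back to
    # the frame realigns each count with its own row.
    return ordered.groupby(group_col).cumcount()


df['posted'] = count_earlier(df, 'date_posted')
df['comp'] = count_earlier(df, 'date_completed')

print(df)
```

Note that rows sharing the same date within a school still get consecutive numbers here, so if ties matter for "before that date" you would need to handle them separately.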

Answer