Pandas grouping by week

Question

I have a data frame in pandas and need to get the count of items by week, for example from 9/6 to 9/13. Similarly, I need to find the count for these intervals: 9/13 to 9/20, 9/20 to 9/27, and 9/27 to 10/4. Thank you!

Accepted Answer

With the caveat that the result depends on how the first day of the week is defined, you could use something like the following code.

    import numpy as np
    import pandas as pd

    # d is the data from the question
    df = pd.DataFrame(data=d)
    df['Date'] = pd.to_datetime(df['Date'])

I. Discontinuous index

Monday is chosen as the first day of the week.

    # (1) Build a series of first_day_of_week; Monday is chosen as the first day of the week
    weeks_index = df['Date'] - df['Date'].dt.weekday * np.timedelta64(1, 'D')

    # (2) Groupby and some tidying
    df2 = (
        df.groupby([df['Name'], weeks_index])
          .count()
          .rename(columns={'Date': 'Count'})
          .swaplevel()   # weeks to first level
          .sort_index()
          .unstack(1).fillna(0.0)
          .astype(int)
          .rename_axis('first_day_of_week')
    )

    >>> print(df2)
    Name                  A  B  C  D  K  M  R
    first_day_of_week
    2021-08-30            1  0  0  0  0  0  0
    2021-09-06            0  0  3  1  0  0  0
    2021-09-13            0  0  0  0  1  0  0
    2021-09-20            0  0  0  1  0  0  1
    2021-09-27            0  0  0  0  1  1  0
    2021-10-18            0  1  0  0  0  0  0

II. Continuous index

This part does not differ much from the previous one. We build a continuous version of the index and use it to reindex. Monday is chosen as the first day of the week (for both indices, obviously).

    # (1a) Build a series of first_day_of_week; Monday is chosen as the first day of the week
    weeks_index = df['Date'] - df['Date'].dt.weekday * np.timedelta64(1, 'D')

    # (1b) Build a continuous series of first_day_of_week
    continuous_weeks_index = pd.date_range(start=weeks_index.min(),
                                           end=weeks_index.max(),
                                           freq='W-MON')    # Monday

    # (2) Groupby, unstack, reindex, and some tidying
    df2 = (
        df
          # groupby and count
          .groupby([df['Name'], weeks_index])
          .count()
          .rename(columns={'Date': 'Count'})
          # unstack on weeks
          .swaplevel()    # weeks to first level
          .sort_index()
          .unstack(1)
          # reindex to insert weeks with no data
          .reindex(continuous_weeks_index)  # new index
          # clean up
          .fillna(0.0)
          .astype(int)
          .rename_axis('first_day_of_week')
    )

    >>> print(df2)
    Name               A  B  C  D  K  M  R
    first_day_of_week
    2021-08-30         1  0  0  0  0  0  0
    2021-09-06         0  0  3  1  0  0  0
    2021-09-13         0  0  0  0  1  0  0
    2021-09-20         0  0  0  1  0  0  1
    2021-09-27         0  0  0  0  1  1  0
    2021-10-04         0  0  0  0  0  0  0
    2021-10-11         0  0  0  0  0  0  0
    2021-10-18         0  1  0  0  0  0  0

Last step if needed

    df2.stack()
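Since the question's original data is not reproduced here, the following is a minimal, self-contained sketch of the same idea that can be run as-is. The sample rows in `d` are hypothetical, and `pd.crosstab` is swapped in for the groupby/unstack chain purely as a shorter way to get the same week-by-name count table; it is not the answer's exact method.

```python
import pandas as pd

# Hypothetical sample data (assumption): the question's frame was not preserved.
d = {
    "Name": ["A", "C", "C", "C", "D", "K", "B"],
    "Date": ["2021-09-01", "2021-09-06", "2021-09-08", "2021-09-12",
             "2021-09-10", "2021-09-14", "2021-10-20"],
}
df = pd.DataFrame(data=d)
df["Date"] = pd.to_datetime(df["Date"])

# Monday of each row's week: subtract the weekday offset (Monday == 0).
week_start = df["Date"] - pd.to_timedelta(df["Date"].dt.weekday, unit="D")

# Count rows per (week, name); missing combinations are filled with 0.
counts = pd.crosstab(week_start, df["Name"])

# Reindex on every Monday in range so weeks without rows appear as zeros.
all_weeks = pd.date_range(week_start.min(), week_start.max(), freq="W-MON")
counts = counts.reindex(all_weeks, fill_value=0).rename_axis("first_day_of_week")

# Optional long format, one row per (week, name).
long_counts = counts.stack().rename("Count")

print(counts)
```

Skipping the `reindex` step gives the discontinuous variant, where only weeks that contain at least one row are listed.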
