Python pandas dataframe with daily data – keep first and last rows per month

Question

I have a Python pandas dataframe that looks like this:

I want to keep the first and the last row per month. How can I do that? I tried using the following code:

but I don't get the results I want.

Accepted Answer

pandas groupby operations don't sort each group prior to aggregation, which is why 'first' and 'last' are not selecting the correct rows for you.

Additionally, you can use .resample('M') instead of a groupby on year & month.

    out = (
        df.set_index(df.index.astype('datetime64[ns]'))  # copying in the data, I lost the datetime index
        .sort_index()   # sort ensures first and last work as expected
        .resample('M')  # resample for a shorthand year/month grouping
        .agg(['first', 'last'])
    )
    print(out)

                        Date                 Close*          month_initial      month        day       year
                       first          last    first     last         first last first last first last first  last
    date_final
    2022-08-31  Aug 12, 2022  Aug 31, 2022  4280.15  3955.00           Aug  Aug   8.0  8.0    12   31  2022  2022
    2022-09-30  Sep 01, 2022  Sep 23, 2022  3966.85  3693.23           Sep  Sep   9.0  9.0     1   23  2022  2022

This output doesn't have the most usable format, so we can use a quick .stack to remedy it:

    out = out.stack()
    print(out)

                              Date   Close* month_initial  month  day  year
    date_final
    2022-08-31 first  Aug 12, 2022  4280.15           Aug    8.0   12  2022
               last   Aug 31, 2022  3955.00           Aug    8.0   31  2022
    2022-09-30 first  Sep 01, 2022  3966.85           Sep    9.0    1  2022
               last   Sep 23, 2022  3693.23           Sep    9.0   23  2022
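To make the sorting point concrete, here is a minimal, self-contained sketch. The sample frame and the names `unsorted`, `ordered`, and `by_month` are made up for illustration and are not the asker's data. It also swaps in a groupby on `df.index.to_period("M")` for `.resample('M')`, since newer pandas releases spell the month-end resample alias 'ME'. Treat it as one possible way to reproduce the behaviour, not the only one.

```python
import pandas as pd

# Hypothetical daily closes, stored out of chronological order on purpose.
df = pd.DataFrame(
    {"Close": [3955.00, 4280.15, 3693.23, 4100.00, 3966.85]},
    index=pd.to_datetime(
        ["2022-08-31", "2022-08-12", "2022-09-23", "2022-08-20", "2022-09-01"]
    ),
)

# Without sorting, 'first' and 'last' follow the stored row order,
# so they do not correspond to the earliest and latest dates.
unsorted = df.groupby(df.index.to_period("M"))["Close"].agg(["first", "last"])

# Sorting the index first makes 'first' and 'last' mean
# the earliest and latest date within each month.
ordered = df.sort_index()
by_month = ordered.groupby(ordered.index.to_period("M"))["Close"].agg(
    ["first", "last"]
)

print(unsorted)
print(by_month)
```

In this sketch the unsorted version reports the August 31 close as the "first" value for August, while the sorted version reports the August 12 close, which is the behaviour the answer describes.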

Answer