
How to group by month and year from a specific range?

The data have reported values for January 2006 through January 2019. I need to compute the total number of passengers (Passenger_Count) per month. The range should go from 2009 to 2019, so the resulting dataframe should have 121 entries (10 years * 12 months, plus 1 for January 2019).

The data

I have been doing:

df.groupby(['ReportPeriod'])['Passenger_Count'].sum()

But it doesn't give me the right result. Instead, it gives:

Wrong result


Answer

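You can convert ReportPeriod to datetime and then group by a year-month key:

import pandas as pd
df['ReportPeriod'] = pd.to_datetime(df['ReportPeriod'])
# Group by year-month ('%Y-%m'); '%Y-%m-%d' would group by individual day instead
out = df.groupby(df['ReportPeriod'].dt.strftime('%Y-%m'))['Passenger_Count'].sum()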
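
The question also asks to restrict the result to 2009 through January 2019, which the grouping alone does not do. A minimal sketch of one way to add that filter is below. The sample rows and the MM/DD/YYYY date format are assumptions made up for illustration, since the real data is not shown; it also swaps the string key for a monthly period (`dt.to_period("M")`), which is an alternative to `strftime`, not the original answer's method.

```python
import pandas as pd

# Hypothetical sample rows; the real file's layout and date format are assumptions.
df = pd.DataFrame({
    "ReportPeriod": ["12/01/2008", "01/01/2009", "01/01/2009",
                     "02/01/2009", "01/01/2019", "01/01/2019"],
    "Passenger_Count": [100, 10, 20, 5, 7, 8],
})
df["ReportPeriod"] = pd.to_datetime(df["ReportPeriod"], format="%m/%d/%Y")

# Keep only January 2009 through January 2019 (both ends inclusive).
start, end = pd.Timestamp("2009-01-01"), pd.Timestamp("2019-01-31")
filtered = df[df["ReportPeriod"].between(start, end)]

# Sum passengers per calendar month, keyed by a monthly Period.
monthly = filtered.groupby(filtered["ReportPeriod"].dt.to_period("M"))["Passenger_Count"].sum()

# Months with no rows are absent from `monthly`; reindex over the full range
# so every month from 2009-01 to 2019-01 appears, filled with 0.
all_months = pd.period_range("2009-01", "2019-01", freq="M")
full = monthly.reindex(all_months, fill_value=0)
```

If every month in the range has at least one row in the real data, `monthly` should already have the expected 121 entries and the reindex step changes nothing.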
User contributions licensed under: CC BY-SA