Compute daily climatology using pandas python

Question

I am trying to use pandas to compute a daily climatology. cum_data is the data frame containing daily dates from 1 January 1950 to 31 December 1953. I want to create a new vector of length 365 whose first element is the average of rand_data for 1 January across 1950, 1951, 1952 and 1953, and so on for every other day of the year.

Accepted Answer

You can group by the day of the year and then calculate the mean for these groups:

    cum_data.groupby(cum_data.index.dayofyear).mean()

However, you have to be aware of leap years, which cause problems with this approach: after 28 February the day-of-year number is shifted by one in a leap year, so day 60 is 1 March in a normal year but 29 February in a leap year. As an alternative, you can group by the month and the day:

    In [13]: cum_data.groupby([cum_data.index.month, cum_data.index.day]).mean()
    Out[13]:
    1   1     462.25
        2     631.00
        3     615.50
        4     496.00
    ...
    12  28    378.25
        29    427.75
        30    528.50
        31    678.50
    Length: 366, dtype: float64
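To make the difference concrete, here is a minimal, self-contained sketch. The original data is not shown, so cum_data is rebuilt here as a Series of made-up random values over the same date range; the names by_doy, by_month_day and climatology_365 are only illustrative. Dropping 29 February to reach the 365 values asked for in the question is one possible choice, not something the answer above prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: one random value per day, 1950-1953.
dates = pd.date_range("1950-01-01", "1953-12-31", freq="D")
rng = np.random.default_rng(0)
cum_data = pd.Series(rng.integers(0, 1000, size=len(dates)),
                     index=dates, name="rand_data")

# Approach 1: group by day of year. 1952 is a leap year, so its dates after
# 28 February land one slot later than the same dates in the other years.
by_doy = cum_data.groupby(cum_data.index.dayofyear).mean()

# Approach 2: group by (month, day), so each calendar date is averaged with itself.
by_month_day = cum_data.groupby([cum_data.index.month, cum_data.index.day]).mean()

# Assumption: to get exactly 365 values, drop 29 February before grouping.
is_feb29 = (cum_data.index.month == 2) & (cum_data.index.day == 29)
no_leap = cum_data[~is_feb29]
climatology_365 = no_leap.groupby([no_leap.index.month, no_leap.index.day]).mean()

print(len(by_doy), len(by_month_day), len(climatology_365))
```

The result of the second and third groupings is indexed by (month, day) pairs, so the 1 January average is climatology_365.loc[(1, 1)]. If a flat vector is needed, climatology_365.to_numpy() gives the 365 values in calendar order.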

Answer