Here is some example data:
import pandas as pd

mydf = {'Month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
        'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]}
my_df = pd.DataFrame(mydf, columns=['Month', 'Freq'])
my_df
    Month  Freq
0       1     5
1       2    10
2       3    15
3       4    20
4       5    25
5       6    30
6       7    35
7       8    40
8       9    45
9      10    50
10     11    55
11     12    60
How can I create a new DataFrame that groups the months into seasons and sums the frequency for each season, with the result still being a DataFrame?
I would like something like this, where Winter is Month = 12, 1, 2, Spring is Month = 3, 4, 5, and so on:
   Season  Freq
0  Winter    75
1  Spring    60
2  Summer   105
3  Autumn   150
I tried selecting the rows and concatenating them as a first step, but I keep getting errors.
Answer
You can create a new column with seasons and group on that column:
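# Label each row with its season (note: the frame is my_df, not df)
my_df['Season'] = my_df['Month'].apply(lambda x: 'Winter' if x in (12, 1, 2) else 'Spring' if x in (3, 4, 5) else 'Summer' if x in (6, 7, 8) else 'Autumn')
# Sum the frequencies within each season
res = my_df.groupby('Season')['Freq'].sum()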
>>> print(res)
Season
Autumn    150
Spring     60
Summer    105
Winter     75
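Note that `res` above is a Series indexed by season, sorted alphabetically, whereas the question asks for a DataFrame in Winter, Spring, Summer, Autumn order. A minimal sketch of one way to get that shape follows. The names `month_to_season`, `season_order` and `season_df` are made up for illustration, and the dictionary lookup via `map` is just an alternative to the chained lambda, not the only way to do it.

```python
import pandas as pd

my_df = pd.DataFrame({
    'Month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60],
})

# Hypothetical lookup table: month number -> season (grouping taken from the question).
month_to_season = {12: 'Winter', 1: 'Winter', 2: 'Winter',
                   3: 'Spring', 4: 'Spring', 5: 'Spring',
                   6: 'Summer', 7: 'Summer', 8: 'Summer',
                   9: 'Autumn', 10: 'Autumn', 11: 'Autumn'}
season_order = ['Winter', 'Spring', 'Summer', 'Autumn']

# Build the season labels without adding a column to my_df.
seasons = my_df['Month'].map(month_to_season).rename('Season')

season_df = (
    my_df.groupby(seasons)['Freq'].sum()  # Series indexed by season
    .reindex(season_order)                # put seasons in the requested order
    .rename_axis('Season')                # make sure the index is named 'Season'
    .reset_index()                        # turn the index into a regular column
)
print(season_df)
```

With the example data this should give a two-column DataFrame (`Season`, `Freq`) with the totals 75, 60, 105 and 150 for Winter, Spring, Summer and Autumn.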