Here is some example data:
    import pandas as pd

    mydf = {'Month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
            'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]}
    my_df = pd.DataFrame(mydf, columns=['Month', 'Freq'])
    my_df

        Month  Freq
    0       1     5
    1       2    10
    2       3    15
    3       4    20
    4       5    25
    5       6    30
    6       7    35
    7       8    40
    8       9    45
    9      10    50
    10     11    55
    11     12    60
How can I create a new dataframe that groups the months into seasons and sums the frequency for each season, with the output still being a dataframe?
I would like something like this, where Winter is Month = 12, 1, 2, Spring is Month = 3, 4, 5, and so on:
       Season  Freq
    0  Winter    75
    1  Spring    60
    2  Summer   105
    3  Autumn   150
I tried selecting the rows and concatenating them as a first step, but I keep getting errors.
Answer
You can create a new column with seasons and group on that column:
    # Label each row with its season, based on the month number
    my_df['Season'] = my_df['Month'].apply(
        lambda x: 'Winter' if x in (12, 1, 2)
        else 'Spring' if x in (3, 4, 5)
        else 'Summer' if x in (6, 7, 8)
        else 'Autumn'
    )
    # Sum the frequencies within each season
    res = my_df.groupby('Season')['Freq'].sum()

    >>> print(res)
    Season
    Autumn    150
    Spring     60
    Summer    105
    Winter     75
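Note that the grouped result above is a Series sorted alphabetically by season, whereas the question asks for a dataframe in Winter, Spring, Summer, Autumn order. One possible way to get that shape is sketched below. The names `season_of_month`, `season_order` and `season_df` are made up for this illustration and are not part of the original answer; the sketch assumes an ordered categorical is an acceptable way to fix the row order.

```python
import pandas as pd

# Same example data as in the question.
my_df = pd.DataFrame({
    'Month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60],
})

# Hypothetical lookup table: month number -> season name.
season_of_month = {
    12: 'Winter', 1: 'Winter', 2: 'Winter',
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Autumn', 10: 'Autumn', 11: 'Autumn',
}

# Desired row order in the output (assumed from the question's example).
season_order = ['Winter', 'Spring', 'Summer', 'Autumn']

# An ordered categorical makes groupby sort by season_order, not alphabetically.
seasons = pd.Categorical(
    my_df['Month'].map(season_of_month),
    categories=season_order,
    ordered=True,
)

season_df = (
    my_df.assign(Season=seasons)
         .groupby('Season', observed=True)['Freq']
         .sum()
         .reset_index()  # turn the grouped Series back into a dataframe
)

print(season_df)
```

With the example data this should give one row per season in the order Winter, Spring, Summer, Autumn, with sums 75, 60, 105 and 150. The dictionary lookup does the same job as the chained conditional in the lambda; it is simply easier to read and extend.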