How to select specific rows in a DataFrame, group them, and find the sum using Python?

Here is some example data:

import pandas as pd

mydf = {'Month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
        'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
        }
my_df = pd.DataFrame(mydf, columns=['Month', 'Freq'])
my_df

    Month  Freq
0       1     5
1       2    10
2       3    15
3       4    20
4       5    25
5       6    30
6       7    35
7       8    40
8       9    45
9      10    50
10     11    55
11     12    60

How can I create a new DataFrame that groups the months into seasons and gives the total frequency for each season, with the output still being a DataFrame?

I would like something like this, where Winter is Month = 12, 1, 2, Spring is Month = 3, 4, 5, and so on:

   Season  Freq
0  Winter    75
1  Spring    60
2  Summer   105
3  Autumn   150

I tried selecting the rows and concatenating them as a first step, but I keep getting errors.

Answer

You can create a new column with seasons and group on that column:

my_df['Season'] = my_df['Month'].apply(lambda x: 'Winter' if x in (12, 1, 2) else 'Spring' if x in (3, 4, 5) else 'Summer' if x in (6, 7, 8) else 'Autumn')

res = my_df.groupby('Season')['Freq'].sum()

>>> print(res)

Season
Autumn    150
Spring     60
Summer    105
Winter     75
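
Note that the result above is a Series sorted alphabetically by season, whereas the question asks for a DataFrame in Winter, Spring, Summer, Autumn order. One possible way to get that, as a rough sketch: map months to seasons with a dictionary, make the season an ordered categorical, and call reset_index() after the sum. The names month_to_season, season_order and season_df are made up here for illustration and are not part of the original answer.

```python
import pandas as pd

my_df = pd.DataFrame({
    'Month': list(range(1, 13)),
    'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60],
})

# Hypothetical lookup table: month number -> season name
month_to_season = {
    12: 'Winter', 1: 'Winter', 2: 'Winter',
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Autumn', 10: 'Autumn', 11: 'Autumn',
}
# Desired row order in the result (assumed from the question's example)
season_order = ['Winter', 'Spring', 'Summer', 'Autumn']

# An ordered categorical makes groupby sort by season_order, not alphabetically
seasons = pd.Categorical(my_df['Month'].map(month_to_season),
                         categories=season_order, ordered=True)

season_df = (
    my_df.assign(Season=seasons)
         .groupby('Season', observed=True)['Freq']
         .sum()
         .reset_index()  # turn the grouped Series back into a DataFrame
)
print(season_df)
```

With the example data this should give a two-column DataFrame (Season, Freq) with the rows Winter 75, Spring 60, Summer 105, Autumn 150. If the row order does not matter, the categorical step can be dropped and reset_index() applied directly to the original groupby result.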
User contributions licensed under: CC BY-SA