I would like to learn how to specify a “subset-sum” in a dataframe
My dataframe looks like this:
The Data/Time column is the dataframes’ index
With
Sum = data['A'].sum()
I get the total sum of column A.
My aim is to sum up a subset of rows only like between 2022-03-18 07:37:51 and 2022-03-18 07:37:55
so that I get a “sum” row:
How can I specify the row numbers to be summed for each column, especially when the index is in datetime format?
Advertisement
Answer
data[(data.index >= '2020-01-01 00:00:16') & (data.index <= '2020-01-01 00:00:17')].sum(axis=0)
simply, use axis = 0 for each coulmn sum, and check by a.index >= ‘2020-01-01 00:00:16’ and equivalent for upper bound
if you want to use datetime module:
from datetime import datetime data[(data.index >= datetime(2020, 1, 1, 0, 0, 16)) & (data.index <= datetime(2020, 1, 1, 0, 0, 17))].sum(axis=0)