I have a dataframe like this:
ds                     y
2018-07-25 22:00:00    1
2018-07-25 23:00:00    2
2018-07-26 00:00:00    3
2018-07-26 01:00:00    4
2018-07-26 02:00:00    5
What I want to get is a new dataframe which looks like this:
ds            y
2018-07-25    3
2018-07-26    12
I want to get a new dataframe df1 in which all the entries for one day are summed up in y,
and I want to keep only one row per day, showing the date without the time.
What I did so far is this:
df1 = df.groupby(df.index.date).transform(lambda x: x[:24].sum())
I used 24 because I have 24 entries every day (one for every hour). I get the correct sum for every day, but I also get 24 rows for every day, together with the existing timestamps. How can I achieve what I want?
Answer
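If you need to sum all values per day, then filtering the first 24 rows is not necessary. Aggregate with sum() instead of transform(), because transform() returns a result with the same number of rows as the input: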
df1 = df.groupby(df.index.date)['y'].sum().reset_index()
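For reference, here is a minimal, self-contained sketch of the approach above, using the sample data from the question. It assumes that ds is the DatetimeIndex of df, as the use of df.index.date implies. The rename_axis("ds") step is an addition not present in the line above: without it, reset_index() names the date column "index" rather than "ds".

```python
import pandas as pd

# Sample data from the question; 'ds' is assumed to be the DatetimeIndex.
index = pd.to_datetime([
    "2018-07-25 22:00:00",
    "2018-07-25 23:00:00",
    "2018-07-26 00:00:00",
    "2018-07-26 01:00:00",
    "2018-07-26 02:00:00",
])
df = pd.DataFrame({"y": [1, 2, 3, 4, 5]}, index=index)

# Group by the calendar date of each timestamp and sum 'y' per day.
# rename_axis("ds") names the grouped index so that reset_index()
# produces a column called 'ds' instead of the default 'index'.
df1 = (
    df.groupby(df.index.date)["y"]
    .sum()
    .rename_axis("ds")
    .reset_index()
)

print(df1)
```

The result has one row per day, with columns ds and y. Note that the ds column then holds plain datetime.date objects rather than pandas timestamps.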