pandas consecutive Boolean event rollup time series

Question

I have some made-up time series data at 1-minute intervals, along with some code that creates a few Boolean columns.

What I am trying to figure out is how to roll up cumulative events (True or 1) per hour, with one rule: if there is no 0 between events, it's the same event. Hopefully that makes sense.
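
To make the requirement concrete, here is a minimal sketch on a hypothetical ten-minute sample (made-up values, not the asker's data). It uses plain Python rather than pandas, and the names `minutes` and `count_events` are invented for illustration. A run of consecutive 1s counts as one event, so the desired count differs from a plain sum.

```python
from itertools import groupby

# Hypothetical one-minute readings for a single Boolean column.
minutes = [0, 1, 1, 1, 0, 1, 0, 0, 1, 1]


def count_events(values):
    """Count runs of consecutive truthy values; each run is one event."""
    return sum(1 for is_on, _ in groupby(values, key=bool) if is_on)


plain_sum = sum(minutes)        # 6: every 1 is counted
events = count_events(minutes)  # 3: each unbroken run of 1s is counted once
```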

Accepted Answer

Check whether the current row (e.g. "2019-01-01 00:02:00") equals 1 and the previous row ("2019-01-01 00:01:00") does not equal 1. Only the first row of each run of 1s passes both tests, so consecutive 1s are counted once in the sum.

    >>> df.resample('H').apply(lambda x: (x.eq(1) & x.shift().ne(1)).sum())
                         condition1_bool  condition2_bool  condition3_bool
    2019-01-01 00:00:00                4                8                4
    2019-01-01 01:00:00                9                7                6
    2019-01-01 02:00:00                7               14                4
    2019-01-01 03:00:00                2                8                7
    2019-01-01 04:00:00                4                9                5
    ...                              ...              ...              ...
    2019-01-06 21:00:00                4                8                2
    2019-01-06 22:00:00                3               11                4
    2019-01-06 23:00:00                6               11                4
    2019-01-07 00:00:00                8                7                8
    2019-01-07 01:00:00                4                9                6

    [146 rows x 3 columns]

Using your code:

    >>> df.resample('H').sum()
                         condition1_bool  condition2_bool  condition3_bool
    2019-01-01 00:00:00                5                8                5
    2019-01-01 01:00:00                9                8                6
    2019-01-01 02:00:00                7               14                5
    2019-01-01 03:00:00                2                9                7
    2019-01-01 04:00:00                4               11                5
    ...                              ...              ...              ...
    2019-01-06 21:00:00                5               11                3
    2019-01-06 22:00:00                3               15                4
    2019-01-06 23:00:00                6               12                4
    2019-01-07 00:00:00                8                7               10
    2019-01-07 01:00:00                4                9                7

    [146 rows x 3 columns]

Check, using one hour in which every row is 1:

    import pandas as pd

    # Note: pandas 1.4+ spells closed='left' as inclusive='left'
    # (the closed= argument was removed in pandas 2.0).
    dti = pd.date_range('2021-11-15 21:00:00', '2021-11-15 22:00:00',
                        closed='left', freq='T')
    df1 = pd.DataFrame({'c1': 1}, index=dti)

    >>> df1.resample('H').apply(lambda x: (x.eq(1) & x.shift().ne(1)).sum())
                         c1
    2021-11-15 21:00:00   1

    >>> df1.resample('H').sum()
                         c1
    2021-11-15 21:00:00  60
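
The same idea can be tried on a small hand-made frame. The sketch below is an illustration, not part of the original answer: the column name `flag`, the sample values, and the helper name `count_event_starts` are all hypothetical. It also shows one caveat. Because `shift()` runs inside each hourly group, a run of 1s that continues across an hour boundary is counted again in the next hour. If that is unwanted, one option (an assumption about the desired behavior) is to build the event-start mask on the whole frame first and then resample.

```python
import pandas as pd

# Hypothetical sample: 2 hours of 1-minute data, one Boolean-like column.
idx = pd.date_range("2019-01-01 00:00:00", periods=120, freq="min")
df = pd.DataFrame({"flag": 0}, index=idx)
df.iloc[5:10, 0] = 1    # event A: 00:05 to 00:09
df.iloc[20:21, 0] = 1   # event B: 00:20 only
df.iloc[55:65, 0] = 1   # event C: 00:55 to 01:04, crosses the hour boundary


def count_event_starts(x):
    """Count rows equal to 1 whose previous row in the same group is not 1."""
    return (x.eq(1) & x.shift().ne(1)).sum()


# Approach from the answer: shift() is applied within each hourly group,
# so event C is counted in hour 0 and again in hour 1.
per_group = df.resample("h").apply(count_event_starts)   # flag: 3, 1

# Alternative, assuming a boundary-crossing run should count once, in the
# hour where it starts: build the mask on the full frame, then sum per hour.
starts = df.eq(1) & df.shift().ne(1)
by_start = starts.astype(int).resample("h").sum()        # flag: 3, 0

# Plain sum for comparison: every 1 is counted.
raw = df.resample("h").sum()                             # flag: 11, 5
```

Which variant is right depends on whether an event that is still running at the top of the hour should be attributed to the new hour as well.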

Answer