Hi Folks, I need to take first and last value from each group(where the counter value is 1 consecutively )
My Input :-
TIMESTAMP,COUNTER 2019-03-19:13:50,0 2019-03-19:14:00,0 2019-03-19:14:10,0 2019-03-19:14:20,0 2019-03-19:14:30,0 2019-03-19:14:40,1 2019-03-19:14:50,1 2019-03-19:15:00,1 2019-03-19:15:10,0 2019-03-19:15:20,0 2019-03-19:15:30,0 2019-03-19:15:40,1 2019-03-19:15:50,1 2019-03-19:16:00,1
Desired Output :-
2019-03-19:14:40,2019-03-19:15:00 2019-03-19:15:40,2019-03-19:16:00
Advertisement
Answer
You can aggregate by consecutive 1
values with aggregate minimal and maximal TIMESTAMP
:
m = df['COUNTER'].ne(1) df = (df[~m].groupby((m | m.shift()).cumsum()) .agg(TIMESTAMP_min=('TIMESTAMP','min'), TIMESTAMP_max=('TIMESTAMP','max')) .reset_index(drop=True)) print (df) TIMESTAMP_min TIMESTAMP_max 0 2019-03-19:14:40 2019-03-19:15:00 1 2019-03-19:15:40 2019-03-19:16:00
EDIT: Test groups:
print (df) TIMESTAMP COUNTER 0 2019-03-19:13:50 0 1 2019-03-19:14:00 0 2 2019-03-19:14:10 0 3 2019-03-19:14:20 0 4 2019-03-19:14:30 0 5 2019-03-19:14:40 1 6 2019-03-19:14:50 1 7 2019-03-19:15:00 1 8 2019-03-19:15:10 0 9 2019-03-19:15:20 0 10 2019-03-19:15:30 0 11 2019-03-19:15:40 1 12 2019-03-19:15:50 1 13 2019-03-19:16:00 1 m = df['COUNTER'].ne(1) print ((m | m.shift()).cumsum()[~m]) 5 6 6 6 7 6 11 10 12 10 13 10 Name: COUNTER, dtype: int32