Hi folks, I need to take the first and last TIMESTAMP from each group of consecutive rows where COUNTER is 1.
My input:
TIMESTAMP,COUNTER
2019-03-19:13:50,0
2019-03-19:14:00,0
2019-03-19:14:10,0
2019-03-19:14:20,0
2019-03-19:14:30,0
2019-03-19:14:40,1
2019-03-19:14:50,1
2019-03-19:15:00,1
2019-03-19:15:10,0
2019-03-19:15:20,0
2019-03-19:15:30,0
2019-03-19:15:40,1
2019-03-19:15:50,1
2019-03-19:16:00,1
Desired output:
2019-03-19:14:40,2019-03-19:15:00
2019-03-19:15:40,2019-03-19:16:00
Answer
You can group the consecutive rows where COUNTER is 1 and aggregate the minimal and maximal TIMESTAMP of each group:
# True for every row where COUNTER is not 1
m = df['COUNTER'].ne(1)

# keep only the COUNTER == 1 rows; the cumulative sum stays constant inside each run of 1s
df = (df[~m].groupby((m | m.shift()).cumsum())
            .agg(TIMESTAMP_min=('TIMESTAMP','min'), TIMESTAMP_max=('TIMESTAMP','max'))
            .reset_index(drop=True))
print (df)
      TIMESTAMP_min     TIMESTAMP_max
0  2019-03-19:14:40  2019-03-19:15:00
1  2019-03-19:15:40  2019-03-19:16:00
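For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It assumes the sample input from the question is loaded from a string; the names `csv_text`, `group_key` and `result` are only illustrative, and `shift(fill_value=False)` is my small variation that keeps the shifted mask boolean instead of leaving a NaN in the first row.

```python
import io

import pandas as pd

# Sample input copied from the question.
csv_text = """TIMESTAMP,COUNTER
2019-03-19:13:50,0
2019-03-19:14:00,0
2019-03-19:14:10,0
2019-03-19:14:20,0
2019-03-19:14:30,0
2019-03-19:14:40,1
2019-03-19:14:50,1
2019-03-19:15:00,1
2019-03-19:15:10,0
2019-03-19:15:20,0
2019-03-19:15:30,0
2019-03-19:15:40,1
2019-03-19:15:50,1
2019-03-19:16:00,1
"""

df = pd.read_csv(io.StringIO(csv_text))

# True where COUNTER is not 1; these rows separate the runs of 1s.
m = df["COUNTER"].ne(1)

# The key increases on every non-1 row and on the row right after one,
# so it stays constant inside each run of consecutive 1s.
group_key = (m | m.shift(fill_value=False)).cumsum()

result = (
    df[~m]
    .groupby(group_key[~m])
    .agg(TIMESTAMP_min=("TIMESTAMP", "min"), TIMESTAMP_max=("TIMESTAMP", "max"))
    .reset_index(drop=True)
)
print(result)
```

The timestamps are left as strings here; `min` and `max` still pick the right rows because the format is fixed-width and zero-padded, so string order matches time order. If the rows are already in chronological order, the `'first'` and `'last'` aggregations should give the same result.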
EDIT: Test groups:
print (df)

           TIMESTAMP  COUNTER
0   2019-03-19:13:50        0
1   2019-03-19:14:00        0
2   2019-03-19:14:10        0
3   2019-03-19:14:20        0
4   2019-03-19:14:30        0
5   2019-03-19:14:40        1
6   2019-03-19:14:50        1
7   2019-03-19:15:00        1
8   2019-03-19:15:10        0
9   2019-03-19:15:20        0
10  2019-03-19:15:30        0
11  2019-03-19:15:40        1
12  2019-03-19:15:50        1
13  2019-03-19:16:00        1


m = df['COUNTER'].ne(1)
print ((m | m.shift()).cumsum()[~m])
5      6
6      6
7      6
11    10
12    10
13    10
Name: COUNTER, dtype: int32
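As a cross-check of the group boundaries above, the same runs can be found without pandas. This is a different technique from the answer (plain `itertools.groupby` from the standard library), shown only as a sketch; `rows` and `runs` are illustrative names, and the data is copied from the question.

```python
from itertools import groupby

# Rows copied from the question as (timestamp, counter) pairs.
rows = [
    ("2019-03-19:13:50", 0),
    ("2019-03-19:14:00", 0),
    ("2019-03-19:14:10", 0),
    ("2019-03-19:14:20", 0),
    ("2019-03-19:14:30", 0),
    ("2019-03-19:14:40", 1),
    ("2019-03-19:14:50", 1),
    ("2019-03-19:15:00", 1),
    ("2019-03-19:15:10", 0),
    ("2019-03-19:15:20", 0),
    ("2019-03-19:15:30", 0),
    ("2019-03-19:15:40", 1),
    ("2019-03-19:15:50", 1),
    ("2019-03-19:16:00", 1),
]

# groupby() starts a new group whenever the key changes, so every run of
# consecutive COUNTER == 1 rows comes out as its own group.
runs = []
for is_one, run in groupby(rows, key=lambda row: row[1] == 1):
    if is_one:
        stamps = [timestamp for timestamp, _ in run]
        runs.append((stamps[0], stamps[-1]))

for first, last in runs:
    print(f"{first},{last}")
```

The two runs it finds correspond to index rows 5 to 7 and 11 to 13, the same rows that carry the group labels 6 and 10 in the test output.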