import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 2, size=(5, 3)), columns=list('ABC'))
print(df)
I would like to create a fourth column "D" which takes a value of 1 if:
- at least two of the columns (A, B, C) have a value of 1 in the current row, or
- either of the previous 2 periods had at least two columns with a value of 1.
According to the example above, all the rows would have df['D'] == 1.
Answer
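We can build a boolean series that marks the rows with at least 2 ones, take its 3-row rolling sum (the current row plus the previous two), and check whether the result is 0 (in which case D is 0 too) or not (in which case D is 1). Setting min_periods=1 lets the first two rows be evaluated on whatever rows are available: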
df["D"] = df.sum(axis=1).ge(2).rolling(3, min_periods=1).sum().ne(0).astype(int)
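To make the chain easier to follow, it can be split into named steps. The following is only a sketch: the intermediate names (`ones_per_row`, `row_hit`, `hits_in_window`) are illustrative and not part of the original answer, the data is the `df2` sample from this answer, and the explicit `astype(int)` before `rolling` is an extra precaution rather than something the one-liner needs.

```python
import pandas as pd

# df2 from the samples in this answer
df = pd.DataFrame({
    "A": [0, 0, 0, 1, 0, 1, 0, 1, 0, 0],
    "B": [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
    "C": [0, 0, 0, 0, 1, 0, 0, 1, 1, 0],
})

# Step 1: count the ones in each row
ones_per_row = df[["A", "B", "C"]].sum(axis=1)

# Step 2: flag rows that have at least two ones (cast to int: illustrative precaution)
row_hit = ones_per_row.ge(2).astype(int)

# Step 3: count flagged rows among the current row and the two rows before it;
# min_periods=1 avoids NaN for the first two rows
hits_in_window = row_hit.rolling(3, min_periods=1).sum()

# Step 4: D is 1 when the window contains at least one flagged row
df["D"] = hits_in_window.ne(0).astype(int)
print(df)
```

Selecting `df[["A", "B", "C"]]` explicitly, instead of summing the whole frame, keeps the result unchanged if the assignment is run a second time after `D` already exists.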
Samples:
>>> df1

   A  B  C
0  0  0  1
1  0  1  0
2  1  1  1
3  0  1  0
4  1  1  1
5  1  0  0
6  1  0  0
7  0  0  1
8  1  0  1
9  1  0  0

>>> # after...

   A  B  C  D
0  0  0  1  0
1  0  1  0  0
2  1  1  1  1
3  0  1  0  1
4  1  1  1  1
5  1  0  0  1
6  1  0  0  1
7  0  0  1  0
8  1  0  1  1
9  1  0  0  1


>>> df2
   A  B  C
0  0  0  0
1  0  1  0
2  0  0  0
3  1  1  0
4  0  1  1
5  1  0  0
6  0  0  0
7  1  1  1
8  0  1  1
9  0  0  0

>>> # after...

   A  B  C  D
0  0  0  0  0
1  0  1  0  0
2  0  0  0  0
3  1  1  0  1
4  0  1  1  1
5  1  0  0  1
6  0  0  0  1
7  1  1  1  1
8  0  1  1  1
9  0  0  0  1
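As a sanity check, the vectorized expression can be compared against a plain row-by-row loop that applies the rule literally. This is a sketch only: `d_by_loop` is a hypothetical helper written for this comparison, not part of the original answer, and the data is the `df1` sample above.

```python
import pandas as pd

def d_by_loop(frame):
    """Hypothetical reference implementation: apply the rule row by row."""
    # 1 if the row has at least two ones, else 0
    flags = [int(row.sum() >= 2) for row in frame.to_numpy()]
    # D is 1 if the current row or either of the two previous rows is flagged
    return [int(any(flags[max(0, i - 2): i + 1])) for i in range(len(flags))]

# df1 from the samples above
df1 = pd.DataFrame({
    "A": [0, 0, 1, 0, 1, 1, 1, 0, 1, 1],
    "B": [0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    "C": [1, 0, 1, 0, 1, 0, 0, 1, 1, 0],
})

vectorized = df1.sum(axis=1).ge(2).rolling(3, min_periods=1).sum().ne(0).astype(int)

# Both approaches should agree on every row
assert vectorized.tolist() == d_by_loop(df1)
```

The loop is slower on large frames but states the rule directly, which makes it a convenient reference when adjusting the window size or the threshold.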