I have a matrix with 0s and 1s, and want to do a cumsum on each column that resets to 0 whenever a zero is observed. For example, if we have the following:
```python
import pandas as pd

df = pd.DataFrame([[0,1],[1,1],[0,1],[1,0],[1,1],[0,1]], columns=['a','b'])
print(df)
   a  b
0  0  1
1  1  1
2  0  1
3  1  0
4  1  1
5  0  1
```
The result I desire is:
```python
print(df)
   a  b
0  0  1
1  1  2
2  0  3
3  1  0
4  2  1
5  0  2
```
However, when I try `df.cumsum() * df`, I am able to correctly identify the 0 elements, but the counter does not reset:
```python
print(df.cumsum() * df)
   a  b
0  0  1
1  1  2
2  0  3
3  2  0
4  3  4
5  0  5
```
Answer
You can use:
```python
a = df != 0
df1 = a.cumsum() - a.cumsum().where(~a).ffill().fillna(0).astype(int)
print(df1)
   a  b
0  0  1
1  1  2
2  0  3
3  1  0
4  2  1
5  0  2
```
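The one-liner is easier to follow when split into steps. The sketch below is one way to read it, assuming the input holds only 0s and 1s as in the question. The function name `cumsum_with_reset` and the intermediate variable names are made up for illustration and are not part of pandas.

```python
import pandas as pd

def cumsum_with_reset(df):
    """Per-column running count of non-zero entries that restarts after each zero."""
    # True where the entry is non-zero, False where it is zero.
    nonzero = df != 0
    # Plain running count per column; this never resets on its own.
    running = nonzero.astype(int).cumsum()
    # Keep the running count only at the zero rows, carry it forward to the
    # following rows, and use 0 for rows before the first zero in a column.
    baseline = running.where(~nonzero).ffill().fillna(0).astype(int)
    # Subtracting the count reached at the most recent zero restarts the counter.
    return running - baseline

df = pd.DataFrame([[0,1],[1,1],[0,1],[1,0],[1,1],[0,1]], columns=['a','b'])
result = cumsum_with_reset(df)
print(result)
```

The key idea is that `baseline` holds, for every row, the value the plain running count had at the last zero seen in that column, so the difference counts only the entries since that zero.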