I have a matrix with 0s and 1s, and want to do a cumsum on each column that resets to 0 whenever a zero is observed. For example, if we have the following:
import pandas as pd

df = pd.DataFrame([[0, 1], [1, 1], [0, 1], [1, 0], [1, 1], [0, 1]], columns=['a', 'b'])
print(df)
   a  b
0  0  1
1  1  1
2  0  1
3  1  0
4  1  1
5  0  1
The result I desire is:
print(df)
   a  b
0  0  1
1  1  2
2  0  3
3  1  0
4  2  1
5  0  2
However, when I try df.cumsum() * df, I am able to correctly identify the 0 elements, but the counter does not reset:
print(df.cumsum() * df)
   a  b
0  0  1
1  1  2
2  0  3
3  2  0
4  3  4
5  0  5
Answer
You can use:
a = df != 0
df1 = a.cumsum() - a.cumsum().where(~a).ffill().fillna(0).astype(int)
print(df1)
   a  b
0  0  1
1  1  2
2  0  3
3  1  0
4  2  1
5  0  2
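To see why the one-liner above works, it may help to split it into named steps. The following is only a sketch of the same expression: the variable names mask, running, offset and result are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [1, 1], [0, 1], [1, 0], [1, 1], [0, 1]],
                  columns=['a', 'b'])

# True where the cell is non-zero, i.e. where the count should keep growing.
mask = df != 0

# Plain running total per column; on its own it never resets.
running = mask.cumsum()

# Running total as it stood at the most recent zero in each column:
# keep the total only at zero positions, carry it forward with ffill,
# and use 0 for rows that come before the first zero.
offset = running.where(~mask).ffill().fillna(0).astype(int)

# Subtracting the offset restarts the count after every zero.
result = running - offset
print(result)
```

The key idea is that the plain cumulative sum is off by exactly the total accumulated up to the last zero, so subtracting that carried-forward total gives a counter that resets.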