Skip to content
Advertisement

count sequence of value in df

I have df:

id  color val
1     y    3
1     y    3
1     y    3
2     y    1
2     r    2

I want to count the condition that the value in val column is <=3 at least 3 times in the same id and color in a row(in sequence), and when the condition is true, to return the id and color. for example I will get here 1,y thanks

Advertisement

Answer

You can use:

N = 3 # number consecutive matches
s = (df['val'].le(3) # condition: value ≤ 3
 .groupby([df['id'], df['color']]) # make groups
 # is there any occurrence where there are N True in a row?
 .apply(lambda s: s.rolling(N).sum().eq(N).any())
)

# keep only the True and convert to list
s[s].index.to_list()

output: [(1, 'y')]

intermediate s:

id  color
1   y         True
2   r        False
    y        False
Name: val, dtype: bool
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement