I have a list of values and I want to get their rolling frequency, so something like this:
df = pd.DataFrame({'val': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]})
result = df.val.rolling(3).freq()
result == pd.Series([1, 2, 3, 3, 3, 1, 2, 3, 3, 3, 1, 2, 3, 3, 3])
Of course I can do this with a loop, but with a lot of data that gets computationally expensive, so I’d much rather use a built-in or something vectorized. Unfortunately, from my searching, there doesn’t seem to be a solution.
Thanks in advance!
Answer
By default (min_periods equal to the window size), the first n-1 elements of the result of a rolling function with window size n are NaN:
result = df.val.rolling(3).apply(lambda x: np.count_nonzero(x==x.iloc[-1])).astype('Int64')
Result:
0     <NA>
1     <NA>
2        3
3        3
4        3
5        1
6        2
7        3
8        3
9        3
10       1
11       2
12       3
13       3
14       3
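If the leading values should be 1 and 2 instead of missing, as in the question's expected output, two variations might help. The sketch below is not part of the original answer. It assumes the same df as in the question; the names window, partial and shifted are made up for illustration. The first variant keeps the apply-based idea but passes min_periods=1 so partial windows are counted. The second avoids the per-window Python callback by comparing the column with shifted copies of itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'val': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]})
window = 3  # illustrative name for the window size

# Variant 1: same apply-based idea, but allow partial windows at the start.
# raw=True passes each window as a NumPy array, so x[-1] is the newest value.
partial = (
    df.val.rolling(window, min_periods=1)
    .apply(lambda x: np.count_nonzero(x == x[-1]), raw=True)
    .astype(int)
)

# Variant 2: no Python-level callback. Compare the column with itself
# shifted by 0..window-1 positions and add up the matches.
# Positions shifted in from before the start are NaN and never match.
shifted = sum((df.val == df.val.shift(k)).astype(int) for k in range(window))

print(partial.tolist())  # [1, 2, 3, 3, 3, 1, 2, 3, 3, 3, 1, 2, 3, 3, 3]
print(shifted.tolist())  # [1, 2, 3, 3, 3, 1, 2, 3, 3, 3, 1, 2, 3, 3, 3]
```

The shift-based variant does one vectorized comparison per offset, so it is likely to be faster than apply for small windows on long columns. Note that it never counts NaN values as matching themselves, since NaN == NaN is False.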