I have a list of values and I want to get their rolling frequency, so something like this:
df = pd.DataFrame({
    'val': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
})

result = df.val.rolling(3).freq()

result == pd.Series([1, 2, 3, 3, 3, 1, 2, 3, 3, 3, 1, 2, 3, 3, 3])
Of course I can do this with a loop, but with a lot of data that gets computationally expensive, so I'd much rather use a built-in or something vectorized. Unfortunately, from my searching, there doesn't seem to be one.
Thanks in advance!
Answer
The first n-1 elements of the result of a rolling function with window size n are NaN by default, because those windows contain fewer than n values.
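The NaN padding is governed by the `min_periods` argument of `rolling`, which for a fixed integer window defaults to the window size. As a minimal sketch on a small made-up series (`s` is illustrative, not from the question), this compares the default with `min_periods=1`, which lets the partial windows at the start produce a value:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])  # hypothetical sample data

# Default: a window of 3 needs 3 observations, so the first 2 results are NaN.
default_sum = s.rolling(3).sum()

# min_periods=1: partial windows at the start are evaluated as well.
partial_sum = s.rolling(3, min_periods=1).sum()

print(default_sum.tolist())  # [nan, nan, 6.0, 9.0]
print(partial_sum.tolist())  # [1.0, 3.0, 6.0, 9.0]
```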
result = df.val.rolling(3).apply(lambda x: np.count_nonzero(x==x.iloc[-1])).astype('Int64')
Result:
0     <NA>
1     <NA>
2        3
3        3
4        3
5        1
6        2
7        3
8        3
9        3
10       1
11       2
12       3
13       3
14       3
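If the output from the question is wanted instead (counts for the partial windows at the start rather than `<NA>`), one option is to pass `min_periods=1`. The sketch below also shows a variant that avoids a per-row Python callback by comparing the series with shifted copies of itself. It still loops, but only `window` times, and each pass is vectorized. `rolling_freq_shift` is a hypothetical helper name, not a pandas function, and it assumes the values contain no NaN. Whether it is actually faster on your data is worth measuring.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'val': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
})

# Variant 1: same idea as the answer, but partial windows are allowed.
# raw=True passes each window as a NumPy array instead of a Series.
counts_apply = (
    df.val.rolling(3, min_periods=1)
    .apply(lambda w: np.count_nonzero(w == w[-1]), raw=True)
    .astype('int64')
)

# Variant 2 (hypothetical helper): compare against shifted copies.
def rolling_freq_shift(s: pd.Series, window: int) -> pd.Series:
    """Count how often the current value occurs in the trailing window."""
    counts = pd.Series(0, index=s.index, dtype='int64')
    for k in range(window):
        # s.shift(k) aligns the value k rows back with the current row;
        # rows without a predecessor become NaN and compare unequal.
        counts += (s == s.shift(k)).astype('int64')
    return counts

counts_shift = rolling_freq_shift(df.val, 3)

expected = [1, 2, 3, 3, 3, 1, 2, 3, 3, 3, 1, 2, 3, 3, 3]
print(counts_apply.tolist() == expected)  # True
print(counts_shift.tolist() == expected)  # True
```

Both variants count only matches with the most recent value in each window, which is what the expected series in the question describes.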