Compute rolling z-score in pandas dataframe

Is there an open-source function to compute a moving z-score, like https://turi.com/products/create/docs/generated/graphlab.toolkits.anomaly_detection.moving_zscore.create.html? I have access to pandas rolling_std for computing the standard deviation, but want to see whether it can be extended to compute rolling z-scores.

Answer

rolling.apply with a custom function is significantly slower than using builtin rolling functions (such as mean and std). Therefore, compute the rolling z-score from the rolling mean and rolling std:
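
A minimal sketch of this approach, assuming a recent pandas with the `.rolling()` API. The name `zscore`, the sample data, and the choice of `ddof=0` (population standard deviation) are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

def zscore(x, window):
    """Rolling z-score of x relative to the `window` points before each value."""
    r = x.rolling(window=window)
    m = r.mean().shift(1)       # mean of the window ending one step earlier
    s = r.std(ddof=0).shift(1)  # population std of that same window (ddof=0 is an assumption)
    return (x - m) / s

# Hypothetical sample data: 100 uniform random values
x = pd.Series(np.random.default_rng(0).random(100))
z = zscore(x, 10)
```

The first `window` entries of the result are NaN because no complete prior window exists for them. The same function should also work column-wise on a DataFrame, since `rolling`, `shift`, and the arithmetic all operate per column.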

According to the definition on the page linked in the question, the rolling z-score at each point uses the rolling mean and std of the window ending just before that point. Shifting both rolling results by one position with shift(1) achieves this effect.
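
To make the role of shift(1) concrete, the sketch below compares the shifted built-in computation with a direct `rolling.apply` version. It assumes the custom function (here called `zscore_func`, a hypothetical name) receives `window + 1` values, the prior window followed by the current point, and it again assumes `ddof=0`:

```python
import numpy as np
import pandas as pd

def zscore_func(w):
    # w holds window + 1 values: the prior window followed by the current point
    w = np.asarray(w)
    prior, current = w[:-1], w[-1]
    return (current - prior.mean()) / prior.std(ddof=0)

window = 10
x = pd.Series(np.random.default_rng(0).random(100))  # hypothetical sample data

# Built-in rolling statistics, shifted so each point sees only earlier values
r = x.rolling(window=window)
fast = (x - r.mean().shift(1)) / r.std(ddof=0).shift(1)

# Same quantity computed window by window in Python
slow = x.rolling(window=window + 1).apply(zscore_func, raw=True)

assert np.allclose(fast, slow, equal_nan=True)
```

Both versions leave the first `window` positions as NaN and agree to floating-point tolerance elsewhere.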


In the benchmark, even for a small Series (of length 100), the zscore function built on the rolling mean and std is over 5x faster than using rolling.apply. rolling.apply(zscore_func) calls zscore_func once for each rolling window, in what is essentially a Python loop, whereas r.mean() and r.std() are Cythonized. The speed advantage of zscore therefore grows as the length of the Series increases.

This is the setup used for the benchmark:
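
A setup along these lines can be used to run the comparison. It is a sketch, not the original benchmark: the seed, the Series length, the window size, the repeat count, and the use of the standard-library `timeit` module are all assumptions, and timings will vary by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(2017)  # arbitrary seed, chosen only for repeatability

def zscore(x, window):
    # Vectorized version: built-in rolling mean/std, shifted by one step
    r = x.rolling(window=window)
    return (x - r.mean().shift(1)) / r.std(ddof=0).shift(1)

def zscore_func(w):
    # Per-window version for rolling.apply: last value vs. the values before it
    return (w[-1] - w[:-1].mean()) / w[:-1].std(ddof=0)

x = pd.Series(np.random.rand(100))
window = 10

t_fast = timeit.timeit(lambda: zscore(x, window), number=20)
t_slow = timeit.timeit(
    lambda: x.rolling(window + 1).apply(zscore_func, raw=True), number=20
)
print(f"built-in rolling: {t_fast:.4f}s, rolling.apply: {t_slow:.4f}s")
```

Increasing the length of `x` is a simple way to see how the gap between the two timings changes with input size.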

User contributions licensed under: CC BY-SA