Pandas: Rolling window to count the frequency – Fastest approach

I would like to count how often a value occurred over the past x days. In the example below, I want to count, for each row, how many times its value in the Name column appeared in the past 28 days. The data is already sorted by Date.

JavaScript
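To pin down what "correct" means here, a brute-force reference can help when checking faster approaches. The sketch below is an assumption about the intended semantics, not code from the question: the function name `rolling_name_counts`, the sample rows, and the choice to treat the window as the 28 days up to and including the current row's date are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date, timedelta


def rolling_name_counts(dates, names, days=28):
    """For each row, count rows with the same name whose date lies in
    (row_date - days, row_date]. Rows must already be sorted by date."""
    seen = defaultdict(list)  # name -> dates seen so far, in ascending order
    window = timedelta(days=days)
    counts = []
    for d, name in zip(dates, names):
        history = seen[name]
        history.append(d)
        # Index of the first stored date strictly later than d - window.
        start = bisect_right(history, d - window)
        counts.append(len(history) - start)
    return counts


# Hypothetical sample rows (the original dataset is not reproduced here).
dates = [date(2022, 1, 1), date(2022, 1, 5), date(2022, 1, 20),
         date(2022, 2, 10), date(2022, 2, 15)]
names = ["A", "B", "A", "A", "B"]

counts = rolling_name_counts(dates, names)
print(counts)
```

Because it returns exactly one count per input row, in the original row order, it can serve as a baseline for comparing the output of a vectorized pandas solution.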

I found some solutions on Stack Overflow, but none of them is both correct on this dataset and fast.

Approach 1 – not quite correct

JavaScript

Approach 2 – correct but very slow (from this link)

JavaScript

Approach 3 – using sum – Not correct

JavaScript

Approach 4 – also using sum – not correct because the indexes are misaligned (from this link)

JavaScript

Output

JavaScript

Performances

JavaScript

Update: Solution

Based on the accepted answer, I wrote this solution. I believe it works even when the data contains NULL values and duplicates, and it does not change the size of the original dataset.

JavaScript


Answer

IIUC, the issue comes from your tolist(), which discards the index, so the values can no longer be aligned with the original rows and end up shuffled.
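To illustrate the alignment problem, here is a minimal sketch on hypothetical data (the sample rows and the helper column `count` are assumptions, not the question's dataset). A grouped rolling sum comes back ordered group by group, so converting it to a plain list and assigning it to the frame puts values on the wrong rows.

```python
import pandas as pd

# Hypothetical sample data, sorted by Date as in the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2022-01-01", "2022-01-05", "2022-01-20", "2022-02-10", "2022-02-15"]
    ),
    "Name": ["A", "B", "A", "A", "B"],
})

# Rolling 28-day count per Name; the result is indexed by (Name, Date).
rolled = (
    df.assign(count=1)
      .set_index("Date")
      .groupby("Name")["count"]
      .rolling("28D")
      .sum()
)

# The result lists all rows of one group, then all rows of the next,
# which differs from the row order of df.
group_order = rolled.index.get_level_values("Name").tolist()
row_order = df["Name"].tolist()
print(group_order)
print(row_order)

# A plain list carries no index, so this assignment is purely positional.
values_in_group_order = rolled.tolist()
```

Whenever `group_order` and `row_order` differ, assigning `values_in_group_order` as a new column attaches counts to the wrong names.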

Use a merge instead:

JavaScript

output:

JavaScript
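As a rough illustration of the merge idea, the following sketch computes the rolling count per Name and joins it back on the Name and Date keys. It is not the answer's exact code: the sample rows and the helper column `count` are hypothetical, and it assumes each (Name, Date) pair occurs only once, since duplicate keys would multiply rows in the merge.

```python
import pandas as pd

# Hypothetical sample data, sorted by Date as in the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2022-01-01", "2022-01-05", "2022-01-20", "2022-02-10", "2022-02-15"]
    ),
    "Name": ["A", "B", "A", "A", "B"],
})

# Rolling 28-day count per Name. With the default closed="right", a
# time-based window covers (Date - 28 days, Date].
rolled = (
    df.assign(count=1)
      .set_index("Date")
      .groupby("Name")["count"]
      .rolling("28D")
      .sum()
      .astype(int)
      .reset_index()  # columns: Name, Date, count
)

# Join on the keys instead of relying on positional order.
result = df.merge(rolled, on=["Name", "Date"], how="left")
print(result)
```

A left merge keeps the rows of `df` in their original order, so the counts land on the rows they belong to regardless of how the grouped result is ordered.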
User contributions licensed under: CC BY-SA