I am trying to calculate a running total per customer for the previous 365 days using pandas, but my code isn't working. My intended output would be something like this:

    date        customer  daily_total_per_customer  rolling_total
    2016-07-29  1         100                       100
    2016-08-01  1         50                        150
    2017-01-12  1         80                        230
    2017-10-23  1         180                       260
    2018-03-03  1         0                         180
    2018-03-06  1         40                        220
    2019-03-16  1
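One way to get this is a time-based rolling window per group. Below is a minimal sketch, assuming the column names from the intended output above and a frame sorted by customer and then date; a `"365D"` window sums everything within the 365 days up to and including each row's date.

```python
import pandas as pd

# Data taken from the intended output above.
df = pd.DataFrame({
    "date": pd.to_datetime(["2016-07-29", "2016-08-01", "2017-01-12",
                            "2017-10-23", "2018-03-03", "2018-03-06"]),
    "customer": [1, 1, 1, 1, 1, 1],
    "daily_total_per_customer": [100, 50, 80, 180, 0, 40],
})

# The final assignment relies on this ordering (customer, then date).
df = df.sort_values(["customer", "date"])

# Time-based rolling sum over the trailing 365 days, per customer.
rolled = (
    df.set_index("date")
      .groupby("customer")["daily_total_per_customer"]
      .rolling("365D")
      .sum()
)
df["rolling_total"] = rolled.to_numpy()
```

This reproduces the rolling_total column shown in the intended output (e.g. 2018-03-03 drops everything older than 2017-03-04, leaving 180 + 0 = 180).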
Tag: window-functions
Spark SQL Row_number() PartitionBy Sort Desc
I’ve successfully created a row_number() partitionBy in Spark using Window, but would like to sort it descending instead of the default ascending. Here is my working code: That gives me this result: And here I add the desc() to order descending: And get this error: AttributeError: ‘WindowSpec’ object has no attribute ‘desc’ What am I doing wrong here?
Pandas get topmost n records within each group
Suppose I have a pandas DataFrame like this: which looks like: I want to get a new DataFrame with the top 2 records for each id, like this: I can do it by numbering the records within each group after groupby: which looks like: then for the desired output: Output: But is there a more efficient/elegant approach to do this? And also is there more
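A more idiomatic alternative to numbering rows manually is GroupBy.head, which keeps the first n rows of each group without any helper column. A sketch with made-up column names, since the question's frame isn't shown:

```python
import pandas as pd

# Illustrative data; "id" and "value" are assumed names for this sketch.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2, 2],
    "value": [1, 2, 3, 1, 2, 3, 4],
})

# First 2 rows per id, in original row order, no helper column needed.
first2 = df.groupby("id").head(2)

# Largest 2 values per id: sort first, then take the head of each group.
top2 = (df.sort_values("value", ascending=False)
          .groupby("id")
          .head(2)
          .sort_index())
```

GroupBy.head preserves the original index and row order, so no reset or re-sort is needed for the first variant; the second trades one global sort for the per-group numbering step.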