PySpark Incremental Count on Condition

Given a Spark DataFrame with the following columns, I am trying to construct an incremental (running) count for each id that goes up by one whenever the event column evaluates to True.

Here, a new column called results would be created to hold the incremental count.

I’ve tried using window functions but am stumped at this point. Ideally, the solution would increment the count row by row without any group by or aggregation functions.

Thanks in advance.

Answer

You can use the sum function, casting your event column to an int:

User contributions licensed under: CC BY-SA