
Efficiently compare running total for month to total for month

I have a dataframe (df) containing daily predicted data from a model, running until the end of 2020. As each day of the year passes, the actual and id values are added to that day's row. There are multiple names for each day:

JavaScript

I want to add a column named payout. The payout should be 0 unless the month-to-date sum of actual has passed the sum of predicted.

For example, for Nir the sum of predicted is 4200, so the payout should be 0 until the sum of actual passes 4200. Once that threshold is passed, the payout should be 1% of actual - predicted. With the above data, the output would look like this:

JavaScript

In the above output, Xyc has a total predicted of 2000, so its payout should likewise be 0 until the sum of actual passes 2000. The real dataframe has daily data for ~70 names, so I suspect a grouping is needed.


I’ve tried:

JavaScript

However, that simply gave me a running total of actual. I also tried this:

JavaScript

But the above doesn’t check that the month-to-date sum is >= the total for the month before attributing the 1%.


Answer

First, you need to remove the NaN rows from the data.
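
A minimal sketch of that step, assuming the frame has columns named name, date, predicted and actual. The column names and sample values here are illustrative, not taken from the question's data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: future days have a prediction but no actual value yet.
df = pd.DataFrame({
    "name": ["Nir", "Nir", "Nir"],
    "date": pd.to_datetime(["2020-04-01", "2020-04-02", "2020-04-03"]),
    "predicted": [100.0, 100.0, 100.0],
    "actual": [120.0, 90.0, np.nan],
})

# Keep only the rows where an actual value has been recorded.
df_clean = df.dropna(subset=["actual"]).reset_index(drop=True)
```

Restricting dropna to the actual column keeps rows that merely lack some other optional field.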

Here you go:

JavaScript

Result:

JavaScript
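
For reference, the rule described in the question can be sketched with a grouped transform and a grouped cumulative sum. This is one possible approach under stated assumptions, not necessarily the code originally posted. It assumes columns named name, date, predicted and actual, and that the 1% applies to each row's own actual - predicted once the month-to-date actual has strictly passed the month's predicted total. The add_payout helper and the sample values are hypothetical.

```python
import pandas as pd

def add_payout(df: pd.DataFrame, rate: float = 0.01) -> pd.DataFrame:
    """Return a copy of df with a payout column (hypothetical helper)."""
    # Drop rows without an actual value, then order each name's rows by date.
    out = df.dropna(subset=["actual"]).copy()
    out["date"] = pd.to_datetime(out["date"])
    out = out.sort_values(["name", "date"])

    # One group per name and calendar month.
    out["_month"] = out["date"].dt.to_period("M")
    groups = out.groupby(["name", "_month"])

    # Monthly predicted total, broadcast to every row of its group.
    month_predicted = groups["predicted"].transform("sum")
    # Month-to-date running total of actual within each group.
    mtd_actual = groups["actual"].cumsum()

    # Assumption: pay rate * (actual - predicted) for the row, but only
    # once the running actual has strictly passed the monthly prediction.
    passed = mtd_actual > month_predicted
    out["payout"] = ((out["actual"] - out["predicted"]) * rate).where(passed, 0.0)
    return out.drop(columns="_month")

# Illustrative data: A passes its monthly total on the third day, B never does.
sample = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B"],
    "date": ["2020-04-01", "2020-04-02", "2020-04-03", "2020-04-01", "2020-04-02"],
    "predicted": [100.0, 100.0, 100.0, 50.0, 50.0],
    "actual": [150.0, 120.0, 130.0, 40.0, 50.0],
})
result = add_payout(sample)
print(result)
```

Because the NaN rows are dropped first, the monthly predicted total here only covers rows that already have an actual value. If the threshold should be the prediction for the full month, compute month_predicted on the unfiltered frame instead.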