I’ve got a dataframe with two columns one is datetime dataframe consisting of dates, and another one consists of quantity. It looks like something like this,
Date Quantity 0 2019-01-05 10 1 2019-01-10 15 2 2019-01-22 14 3 2019-02-03 12 4 2019-05-11 25 5 2019-05-21 4 6 2019-07-08 1 7 2019-07-30 15 8 2019-09-05 31 9 2019-09-10 44 10 2019-09-25 8 11 2019-12-09 10 12 2020-04-11 111 13 2020-04-17 5 14 2020-06-05 17 15 2020-06-16 12 16 2020-06-22 14
I want to make another dataframe. It should consist of two columns one is Month/Year and the other is Till Highest. I basically want to calculate the highest quantity value until that month and group it using month/year. Example of what I want precisely is,
Month/Year Till Highest 0 Jan/2019 15 1 Feb/2019 15 2 May/2019 25 3 Jul/2019 25 4 Sep/2019 44 5 Dec/2019 44 6 Apr/2020 111 7 Jun/2020 111
In my case, the dataset is vast, and I’ve readings of almost every day of each month and each year in the specified timeline. Here I’ve made a dummy dataset to show an example of what I want.
Please help me with this. Thanks in advance :)
Advertisement
Answer
In R you can use cummax
:
df=data.frame(Date=c("2019-01-05","2019-01-10","2019-01-22","2019-02-03","2019-05-11","2019-05-21","2019-07-08","2019-07-30","2019-09-05","2019-09-10","2019-09-25","2019-12-09","2020-04-11","2020-04-17","2020-06-05","2020-06-16","2020-06-22"),Quantity=c(10,15,14,12,25,4,1,15,31,44,8,10,111,5,17,12,14)) data.frame(`Month/Year`=unique(format(as.Date(df$Date),"%b/%Y")), `Till Highest`=cummax(tapply(df$Quantity,sub("-..$","",df$Date),max)), check.names=F,row.names=NULL) Month/Year Till Highest 1 Jan/2019 15 2 Feb/2019 15 3 May/2019 25 4 Jul/2019 25 5 Sep/2019 44 6 Dec/2019 44 7 Apr/2020 111 8 Jun/2020 111