I’ve got a dataframe with two columns one is datetime dataframe consisting of dates, and another one consists of quantity. It looks like something like this,
JavaScript
x
19
19
1
Date Quantity
2
0 2019-01-05 10
3
1 2019-01-10 15
4
2 2019-01-22 14
5
3 2019-02-03 12
6
4 2019-05-11 25
7
5 2019-05-21 4
8
6 2019-07-08 1
9
7 2019-07-30 15
10
8 2019-09-05 31
11
9 2019-09-10 44
12
10 2019-09-25 8
13
11 2019-12-09 10
14
12 2020-04-11 111
15
13 2020-04-17 5
16
14 2020-06-05 17
17
15 2020-06-16 12
18
16 2020-06-22 14
19
I want to make another dataframe. It should consist of two columns one is Month/Year and the other is Till Highest. I basically want to calculate the highest quantity value until that month and group it using month/year. Example of what I want precisely is,
JavaScript
1
10
10
1
Month/Year Till Highest
2
0 Jan/2019 15
3
1 Feb/2019 15
4
2 May/2019 25
5
3 Jul/2019 25
6
4 Sep/2019 44
7
5 Dec/2019 44
8
6 Apr/2020 111
9
7 Jun/2020 111
10
In my case, the dataset is vast, and I’ve readings of almost every day of each month and each year in the specified timeline. Here I’ve made a dummy dataset to show an example of what I want.
Please help me with this. Thanks in advance :)
Advertisement
Answer
In R you can use cummax
:
JavaScript
1
16
16
1
df=data.frame(Date=c("2019-01-05","2019-01-10","2019-01-22","2019-02-03","2019-05-11","2019-05-21","2019-07-08","2019-07-30","2019-09-05","2019-09-10","2019-09-25","2019-12-09","2020-04-11","2020-04-17","2020-06-05","2020-06-16","2020-06-22"),Quantity=c(10,15,14,12,25,4,1,15,31,44,8,10,111,5,17,12,14))
2
3
data.frame(`Month/Year`=unique(format(as.Date(df$Date),"%b/%Y")),
4
`Till Highest`=cummax(tapply(df$Quantity,sub("-..$","",df$Date),max)),
5
check.names=F,row.names=NULL)
6
7
Month/Year Till Highest
8
1 Jan/2019 15
9
2 Feb/2019 15
10
3 May/2019 25
11
4 Jul/2019 25
12
5 Sep/2019 44
13
6 Dec/2019 44
14
7 Apr/2020 111
15
8 Jun/2020 111
16