I have this data frame:
JavaScript
x
23
23
1
ID Date X 123_Var 456_Var 789_Var
2
A 16-07-19 3 777 250 810
3
A 17-07-19 9 637 121 529
4
A 20-07-19 2 295 272 490
5
A 21-07-19 3 778 600 544
6
A 22-07-19 6 741 792 907
7
A 25-07-19 6 435 416 820
8
A 26-07-19 8 590 455 342
9
A 27-07-19 6 763 476 753
10
A 02-08-19 6 717 211 454
11
A 03-08-19 6 152 442 475
12
A 05-08-19 6 564 340 302
13
A 07-08-19 6 105 929 633
14
A 08-08-19 6 948 366 586
15
B 07-08-19 4 509 690 406
16
B 08-08-19 2 413 725 414
17
B 12-08-19 2 170 702 912
18
B 13-08-19 3 851 616 477
19
B 14-08-19 9 475 447 555
20
B 15-08-19 1 412 403 708
21
B 17-08-19 2 299 537 321
22
B 18-08-19 4 310 119 125
23
I want to show the min
value of n
last days (say, n = 4
), using Date
column, excluding the value of current day.
A similar solution has provided by jezrael. (That one calculates the mean
, and not min
.)
Expected result:
JavaScript
1
23
23
1
ID Date X 123_Var 456_Var 789_Var 123_Var_4 456_Var_4 789_Var_4
2
A 16-07-19 3 777 250 810 NaN NaN NaN
3
A 17-07-19 9 637 121 529 777.0 250.0 810.0
4
A 20-07-19 2 295 272 490 637.0 121.0 529.0
5
A 21-07-19 3 778 600 544 295.0 121.0 490.0
6
A 22-07-19 6 741 792 907 295.0 272.0 490.0
7
A 25-07-19 6 435 416 820 741.0 600.0 544.0
8
A 26-07-19 8 590 455 342 435.0 416.0 820.0
9
A 27-07-19 6 763 476 753 435.0 416.0 342.0
10
A 02-08-19 6 717 211 454 NaN NaN NaN
11
A 03-08-19 6 152 442 475 717.0 211.0 454.0
12
A 05-08-19 6 564 340 302 152.0 211.0 454.0
13
A 07-08-19 6 105 929 633 152.0 340.0 302.0
14
A 08-08-19 6 948 366 586 105.0 340.0 302.0
15
B 07-08-19 4 509 690 406 NaN NaN NaN
16
B 08-08-19 2 413 725 414 509.0 690.0 406.0
17
B 12-08-19 2 170 702 912 413.0 725.0 414.0
18
B 13-08-19 3 851 616 477 170.0 702.0 414.0
19
B 14-08-19 9 475 447 555 170.0 616.0 477.0
20
B 15-08-19 1 412 403 708 170.0 447.0 477.0
21
B 17-08-19 2 299 537 321 412.0 403.0 477.0
22
B 18-08-19 4 310 119 125 299.0 403.0 321.0
23
Advertisement
Answer
Use similar solution like @Chris with custom lambda function in GroupBy.apply
and last join to original by DataFrame.join
:
JavaScript
1
11
11
1
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
2
3
n = 4
4
cols = df.filter(regex='Var').columns
5
f = lambda x: x.asfreq('d').rolling(window=f'{n+1}D',closed="left")[cols].min()
6
7
df2 = (df.set_index('Date')
8
.groupby('ID').apply(f)
9
.add_suffix(f'_{n}'))
10
df = df.join(df2, on=['ID','Date'])
11
JavaScript
1
47
47
1
print (df)
2
ID Date X 123_Var 456_Var 789_Var 123_Var_4 456_Var_4
3
0 A 2019-07-16 3 777 250 810 NaN NaN
4
1 A 2019-07-17 9 637 121 529 777.0 250.0
5
2 A 2019-07-20 2 295 272 490 637.0 121.0
6
3 A 2019-07-21 3 778 600 544 295.0 121.0
7
4 A 2019-07-22 6 741 792 907 295.0 121.0
8
5 A 2019-07-25 6 435 416 820 295.0 272.0
9
6 A 2019-07-26 8 590 455 342 435.0 416.0
10
7 A 2019-07-27 6 763 476 753 435.0 416.0
11
8 A 2019-08-02 6 717 211 454 NaN NaN
12
9 A 2019-08-03 6 152 442 475 717.0 211.0
13
10 A 2019-08-05 6 564 340 302 152.0 211.0
14
11 A 2019-08-07 6 105 929 633 152.0 211.0
15
12 A 2019-08-08 6 948 366 586 105.0 340.0
16
13 B 2019-08-07 4 509 690 406 NaN NaN
17
14 B 2019-08-08 2 413 725 414 509.0 690.0
18
15 B 2019-08-12 2 170 702 912 413.0 690.0
19
16 B 2019-08-13 3 851 616 477 170.0 702.0
20
17 B 2019-08-14 9 475 447 555 170.0 616.0
21
18 B 2019-08-15 1 412 403 708 170.0 447.0
22
19 B 2019-08-17 2 299 537 321 170.0 403.0
23
20 B 2019-08-18 4 310 119 125 299.0 403.0
24
25
789_Var_4
26
0 NaN
27
1 810.0
28
2 529.0
29
3 490.0
30
4 490.0
31
5 490.0
32
6 544.0
33
7 342.0
34
8 NaN
35
9 454.0
36
10 454.0
37
11 302.0
38
12 302.0
39
13 NaN
40
14 406.0
41
15 406.0
42
16 414.0
43
17 477.0
44
18 477.0
45
19 477.0
46
20 321.0
47