I have this data frame:
ID      Date  X  123_Var  456_Var  789_Var
A   16-07-19  3      777      250      810
A   17-07-19  9      637      121      529
A   20-07-19  2      295      272      490
A   21-07-19  3      778      600      544
A   22-07-19  6      741      792      907
A   25-07-19  6      435      416      820
A   26-07-19  8      590      455      342
A   27-07-19  6      763      476      753
A   02-08-19  6      717      211      454
A   03-08-19  6      152      442      475
A   05-08-19  6      564      340      302
A   07-08-19  6      105      929      633
A   08-08-19  6      948      366      586
B   07-08-19  4      509      690      406
B   08-08-19  2      413      725      414
B   12-08-19  2      170      702      912
B   13-08-19  3      851      616      477
B   14-08-19  9      475      447      555
B   15-08-19  1      412      403      708
B   17-08-19  2      299      537      321
B   18-08-19  4      310      119      125
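For reference, the frame can be rebuilt from the table above, e.g. by reading the values from an inline CSV string (a minimal sketch; Date is kept as text here and parsed later):

import io
import pandas as pd

# rebuild the sample frame shown above
data = """ID,Date,X,123_Var,456_Var,789_Var
A,16-07-19,3,777,250,810
A,17-07-19,9,637,121,529
A,20-07-19,2,295,272,490
A,21-07-19,3,778,600,544
A,22-07-19,6,741,792,907
A,25-07-19,6,435,416,820
A,26-07-19,8,590,455,342
A,27-07-19,6,763,476,753
A,02-08-19,6,717,211,454
A,03-08-19,6,152,442,475
A,05-08-19,6,564,340,302
A,07-08-19,6,105,929,633
A,08-08-19,6,948,366,586
B,07-08-19,4,509,690,406
B,08-08-19,2,413,725,414
B,12-08-19,2,170,702,912
B,13-08-19,3,851,616,477
B,14-08-19,9,475,447,555
B,15-08-19,1,412,403,708
B,17-08-19,2,299,537,321
B,18-08-19,4,310,119,125"""
df = pd.read_csv(io.StringIO(data))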
I want to show the minimum value over the last n days (say, n = 4), based on the Date column, excluding the current day's value.
A similar solution was provided by jezrael (that one calculates the mean, not the min).
Expected result:
ID      Date  X  123_Var  456_Var  789_Var  123_Var_4  456_Var_4  789_Var_4
A   16-07-19  3      777      250      810        NaN        NaN        NaN
A   17-07-19  9      637      121      529      777.0      250.0      810.0
A   20-07-19  2      295      272      490      637.0      121.0      529.0
A   21-07-19  3      778      600      544      295.0      121.0      490.0
A   22-07-19  6      741      792      907      295.0      272.0      490.0
A   25-07-19  6      435      416      820      741.0      600.0      544.0
A   26-07-19  8      590      455      342      435.0      416.0      820.0
A   27-07-19  6      763      476      753      435.0      416.0      342.0
A   02-08-19  6      717      211      454        NaN        NaN        NaN
A   03-08-19  6      152      442      475      717.0      211.0      454.0
A   05-08-19  6      564      340      302      152.0      211.0      454.0
A   07-08-19  6      105      929      633      152.0      340.0      302.0
A   08-08-19  6      948      366      586      105.0      340.0      302.0
B   07-08-19  4      509      690      406        NaN        NaN        NaN
B   08-08-19  2      413      725      414      509.0      690.0      406.0
B   12-08-19  2      170      702      912      413.0      725.0      414.0
B   13-08-19  3      851      616      477      170.0      702.0      414.0
B   14-08-19  9      475      447      555      170.0      616.0      477.0
B   15-08-19  1      412      403      708      170.0      447.0      477.0
B   17-08-19  2      299      537      321      412.0      403.0      477.0
B   18-08-19  4      310      119      125      299.0      403.0      321.0
Answer
Use a solution similar to @Chris's, with a custom lambda function in GroupBy.apply, and then join the result back to the original with DataFrame.join:
import pandas as pd

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
n = 4
# columns whose rolling minimum is wanted
cols = df.filter(regex='Var').columns
# per ID: reindex to daily frequency, then take the minimum over a
# date-based window; closed="left" keeps the current day's value out
f = lambda x: x.asfreq('d').rolling(window=f'{n+1}D', closed="left")[cols].min()
df2 = (df.set_index('Date')
         .groupby('ID').apply(f)
         .add_suffix(f'_{n}'))
# join the new columns back to the original frame by ID and Date
df = df.join(df2, on=['ID','Date'])
print(df)
   ID       Date  X  123_Var  456_Var  789_Var  123_Var_4  456_Var_4  789_Var_4
0   A 2019-07-16  3      777      250      810        NaN        NaN        NaN
1   A 2019-07-17  9      637      121      529      777.0      250.0      810.0
2   A 2019-07-20  2      295      272      490      637.0      121.0      529.0
3   A 2019-07-21  3      778      600      544      295.0      121.0      490.0
4   A 2019-07-22  6      741      792      907      295.0      121.0      490.0
5   A 2019-07-25  6      435      416      820      295.0      272.0      490.0
6   A 2019-07-26  8      590      455      342      435.0      416.0      544.0
7   A 2019-07-27  6      763      476      753      435.0      416.0      342.0
8   A 2019-08-02  6      717      211      454        NaN        NaN        NaN
9   A 2019-08-03  6      152      442      475      717.0      211.0      454.0
10  A 2019-08-05  6      564      340      302      152.0      211.0      454.0
11  A 2019-08-07  6      105      929      633      152.0      211.0      302.0
12  A 2019-08-08  6      948      366      586      105.0      340.0      302.0
13  B 2019-08-07  4      509      690      406        NaN        NaN        NaN
14  B 2019-08-08  2      413      725      414      509.0      690.0      406.0
15  B 2019-08-12  2      170      702      912      413.0      690.0      406.0
16  B 2019-08-13  3      851      616      477      170.0      702.0      414.0
17  B 2019-08-14  9      475      447      555      170.0      616.0      477.0
18  B 2019-08-15  1      412      403      708      170.0      447.0      477.0
19  B 2019-08-17  2      299      537      321      170.0      403.0      477.0
20  B 2019-08-18  4      310      119      125      299.0      403.0      321.0
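The closed="left" argument is what leaves the current day out: the date-based window ends just before the row's own timestamp, so only strictly earlier days contribute, and a row whose preceding gap exceeds the window comes out as NaN. A small standalone sketch (toy series and dates, not the data above) illustrates the behaviour:

import pandas as pd

s = pd.Series([1, 2, 3, 4],
              index=pd.to_datetime(['2019-07-16', '2019-07-17',
                                    '2019-07-18', '2019-07-20']))

# 3-day window ending just before each timestamp: the row's own value
# is never included, so the first result is NaN
print(s.rolling('3D', closed='left').min())
# roughly: NaN, 1.0, 1.0, 2.0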