Remove data older than a specific number of hours

I want to remove rows from my DataFrame that are older than, say, 2 hours before the current time, with the cutoff rounded down to minute 00 of that hour (the datetime column is the index).

When I use the code below:

df = df[df.index > df.index.max() - pd.Timedelta(hours=2)]

| datetime            | value |
|---------------------|-------|
| 17-03-2022 15:05:00 | 78    |
| 17-03-2022 15:07:00 | 86    |
| 17-03-2022 15:57:00 | 77    |
| 17-03-2022 15:58:00 | 22    |
| 17-03-2022 15:59:00 | 10    |
| 17-03-2022 16:00:00 | 22    |
| 17-03-2022 16:01:00 | 25    |
| 17-03-2022 16:02:00 | 25    |
| 17-03-2022 17:05:00 | 34    |

Current datetime: '17-03-2022 17:05:00'

Issue: my code keeps the records from '17-03-2022 15:05:00' to '17-03-2022 17:05:00'.

Requirement: keep all records from '17-03-2022 15:00:00' to '17-03-2022 17:05:00'.

The window should start at minute 00 of the hour that is 2 hours before the current time.

| datetime            | value |
|---------------------|-------|
| 17-03-2022 15:00:00 | 18    |
| 17-03-2022 15:05:00 | 78    |
| 17-03-2022 15:07:00 | 86    |
| 17-03-2022 15:57:00 | 77    |
| 17-03-2022 15:58:00 | 22    |
| 17-03-2022 15:59:00 | 10    |
| 17-03-2022 16:00:00 | 22    |
| 17-03-2022 16:01:00 | 25    |
| 17-03-2022 16:02:00 | 25    |
| 17-03-2022 17:05:00 | 34    |

Answer

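Use Timestamp.floor and change > to >=, so that the cutoff is rounded down to the start of the hour and rows exactly at the cutoff are kept: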

df = df[df.index >= (df.index.max() - pd.Timedelta(hours=2)).floor('H')]
print(df)
                     value
datetime                  
2022-03-17 15:00:00     18
2022-03-17 15:05:00     78
2022-03-17 15:07:00     86
2022-03-17 15:57:00     77
2022-03-17 15:58:00     22
2022-03-17 15:59:00     10
2022-03-17 16:00:00     22
2022-03-17 16:01:00     25
2022-03-17 16:02:00     25
2022-03-17 17:05:00     34
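
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The helper name `keep_recent_hours`, its `now` parameter, and the extra 14:59 row are assumptions added for illustration; they are not part of the original question or answer. It uses the lowercase `'h'` alias, since newer pandas versions deprecate the uppercase `'H'`.

```python
import pandas as pd


def keep_recent_hours(df, hours=2, now=None):
    """Keep rows at or after the top of the hour `hours` before a reference time.

    Hypothetical helper for illustration. By default the reference time is the
    latest timestamp in the index, as in the answer above; pass `now` to use
    another reference, e.g. pd.Timestamp.now().
    """
    ref = df.index.max() if now is None else pd.Timestamp(now)
    # Subtract the window, then round down to minute 00 of that hour.
    cutoff = (ref - pd.Timedelta(hours=hours)).floor("h")
    # >= keeps a row that falls exactly on the cutoff (15:00:00 here).
    return df[df.index >= cutoff]


# Data from the question, plus one made-up older row that should be dropped.
raw = {
    "17-03-2022 14:59:00": 5,  # assumption: not in the original data
    "17-03-2022 15:00:00": 18,
    "17-03-2022 15:05:00": 78,
    "17-03-2022 15:07:00": 86,
    "17-03-2022 15:57:00": 77,
    "17-03-2022 15:58:00": 22,
    "17-03-2022 15:59:00": 10,
    "17-03-2022 16:00:00": 22,
    "17-03-2022 16:01:00": 25,
    "17-03-2022 16:02:00": 25,
    "17-03-2022 17:05:00": 34,
}
df = pd.DataFrame(
    {"value": list(raw.values())},
    index=pd.to_datetime(list(raw), format="%d-%m-%Y %H:%M:%S"),
)
df.index.name = "datetime"

result = keep_recent_hours(df)
print(result)
```

Note that `df.index.max()` only stands in for the current time when the newest row is recent. If the data can lag behind the clock, passing `now=pd.Timestamp.now()` is probably closer to what the question describes.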
User contributions licensed under: CC BY-SA