Given existing datetime values in a pandas DataFrame, either as the index or as values in a column, is there a way to “autofill” the remaining timestamps based on the fixed increment between the earlier ones?
For example, given:
import pandas as pd
import numpy as np
df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
                  index=[pd.Timestamp('20130101 09:00:00'),
                         pd.Timestamp('20130101 09:00:05'),
                         pd.Timestamp('20130101 09:00:10'),
                         np.nan,
                         np.nan])
I would like to apply a function to yield:
| | B |
|---|---|
| 2013-01-01 09:00:00 | 0.0 |
| 2013-01-01 09:00:05 | 1.0 |
| 2013-01-01 09:00:10 | 2.0 |
| 2013-01-01 09:00:15 | NaN |
| 2013-01-01 09:00:20 | 4.0 |
The timestamps for my last two data points are missing. The timesteps are fixed at 5-second increments.
This will be for thousands of rows. While I might reset_index and then create a function to apply to each row, this seems cumbersome. Is there a slick or built-in way to do this that I’m not finding?
Answer
Assuming the first index value is a valid datetime and all the values are spaced 5s apart, you could do the following:
df.index = pd.date_range(df.index[0], periods=len(df), freq='5s')
>>> df
                       B
2013-01-01 09:00:00  0.0
2013-01-01 09:00:05  1.0
2013-01-01 09:00:10  2.0
2013-01-01 09:00:15  NaN
2013-01-01 09:00:20  4.0
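If you would rather not hardcode the 5-second step, one option is to infer it from the timestamps that are present. The following is only a sketch: `fill_fixed_step` is a hypothetical helper name, not a pandas function, and it assumes the first value is valid and that every valid timestamp is evenly spaced. It raises an error rather than guessing when those assumptions do not hold.

```python
import numpy as np
import pandas as pd


def fill_fixed_step(times):
    """Rebuild a fixed-step datetime sequence from its valid values.

    Hypothetical helper (not part of pandas). `times` can be an index or a
    column; missing entries may be NaN or NaT.
    """
    idx = pd.DatetimeIndex(pd.to_datetime(list(times)))
    if pd.isna(idx[0]):
        raise ValueError("the first timestamp must be valid")
    valid = idx[idx.notna()]
    if len(valid) < 2:
        raise ValueError("need at least two valid timestamps to infer the step")
    # Differences between consecutive valid timestamps.
    steps = valid[1:] - valid[:-1]
    if steps.nunique() != 1:
        raise ValueError("valid timestamps are not evenly spaced")
    # Regenerate the whole sequence from the first value and the inferred step.
    return pd.date_range(start=idx[0], periods=len(idx), freq=steps[0])


df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
                  index=[pd.Timestamp('20130101 09:00:00'),
                         pd.Timestamp('20130101 09:00:05'),
                         pd.Timestamp('20130101 09:00:10'),
                         np.nan,
                         np.nan])
df.index = fill_fixed_step(df.index)
```

The same helper should also work when the timestamps live in a column, for example `df['time'] = fill_fixed_step(df['time'])`, where `time` is a made-up column name used only for illustration.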