Given previous datetime values in a Pandas DataFrame, either as an index or as values in a column, is there a way to "autofill" the remaining timestamps based on the fixed increment between the previous ones?
For example, given:
    import pandas as pd
    import numpy as np

    df = pd.DataFrame(
        {'B': [0, 1, 2, np.nan, 4]},
        index=[pd.Timestamp('20130101 09:00:00'),
               pd.Timestamp('20130101 09:00:05'),
               pd.Timestamp('20130101 09:00:10'),
               np.nan,
               np.nan],
    )
I would like to apply a function to yield:
| | B |
|---|---|
| 2013-01-01 09:00:00 | 0.0 |
| 2013-01-01 09:00:05 | 1.0 |
| 2013-01-01 09:00:10 | 2.0 |
| 2013-01-01 09:00:15 | NaN |
| 2013-01-01 09:00:20 | 4.0 |
The timestamps for my last two data points are missing. Here, the timesteps are fixed at 5-second increments.
This will be for thousands of rows. While I could call reset_index and then write a function to apply to each row, this seems cumbersome. Is there a slick or built-in way to do this that I'm not finding?
Answer
Assuming the first index value is a valid datetime and all rows are spaced exactly 5 seconds apart, you can rebuild the entire index with pd.date_range:
    df.index = pd.date_range(df.index[0], periods=len(df), freq='5s')

    >>> df
                           B
    2013-01-01 09:00:00  0.0
    2013-01-01 09:00:05  1.0
    2013-01-01 09:00:10  2.0
    2013-01-01 09:00:15  NaN
    2013-01-01 09:00:20  4.0
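If you would rather not hard-code the `'5s'` frequency, one possible variation is to infer the step from the first two timestamps. The sketch below is only an illustration: `fill_fixed_increments` is a hypothetical helper name, not part of pandas, and it assumes that the first two index entries are valid and that every row really is one fixed step after the previous one.

```python
import numpy as np
import pandas as pd


def fill_fixed_increments(df):
    """Hypothetical helper: rebuild the index from its first two timestamps.

    Assumes the first two index entries are valid datetimes and that
    all rows are evenly spaced by the same step.
    """
    idx = pd.to_datetime(df.index)   # missing entries become NaT
    step = idx[1] - idx[0]           # inferred fixed increment (a Timedelta)
    out = df.copy()                  # leave the caller's frame untouched
    out.index = pd.date_range(start=idx[0], periods=len(df), freq=step)
    return out


# Same example frame as in the question.
df = pd.DataFrame(
    {'B': [0, 1, 2, np.nan, 4]},
    index=[pd.Timestamp('20130101 09:00:00'),
           pd.Timestamp('20130101 09:00:05'),
           pd.Timestamp('20130101 09:00:10'),
           np.nan,
           np.nan],
)

filled = fill_fixed_increments(df)
print(filled)
```

Like the one-liner above, this overwrites every index entry rather than only the missing ones, so it is only appropriate when the existing timestamps already follow the fixed spacing. If the datetimes live in a column instead of the index, the same `pd.date_range` result can be assigned to that column.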