resample/interpolate time series with datetimeindex

I have two dataframes each containing one or more time series from the same time frame but sampled at different timestamps.

I’d like to merge them into a single one resampled and interpolated with the index of the first.

Here’s a sample of the first dataframe:

                        a      b      c     d
2013-01-01 07:00:00  0.45  24.33   9.04  0.00
2013-01-01 08:00:00  0.55  23.11  11.60  0.06
2013-01-01 09:00:00  0.69  27.23  18.18  0.03
2013-01-01 10:00:00  0.64  26.58  31.46  0.06
2013-01-01 11:00:00  0.36  17.50  42.58  0.29
2013-01-01 12:00:00  0.32  15.39  50.30  0.17
2013-01-01 13:00:00  0.41  17.73  51.45  0.13
2013-01-01 14:00:00  0.50  19.48  50.50  0.05
2013-01-01 15:00:00  0.48  18.32  51.51  0.03
2013-01-01 16:00:00  0.50  18.49  50.70  0.02
2013-01-01 17:00:00  1.13  32.89  40.07  0.20
2013-01-01 18:00:00  1.81  59.64  16.59  0.37

JavaScript
​x
 
                        a      b      c     d
2013-01-01 07:00:00  0.45  24.33   9.04  0.00
2013-01-01 08:00:00  0.55  23.11  11.60  0.06
2013-01-01 09:00:00  0.69  27.23  18.18  0.03
2013-01-01 10:00:00  0.64  26.58  31.46  0.06
2013-01-01 11:00:00  0.36  17.50  42.58  0.29
2013-01-01 12:00:00  0.32  15.39  50.30  0.17
2013-01-01 13:00:00  0.41  17.73  51.45  0.13
2013-01-01 14:00:00  0.50  19.48  50.50  0.05
2013-01-01 15:00:00  0.48  18.32  51.51  0.03
2013-01-01 16:00:00  0.50  18.49  50.70  0.02
2013-01-01 17:00:00  1.13  32.89  40.07  0.20
2013-01-01 18:00:00  1.81  59.64  16.59  0.37
​

And the second one:

                        e
2013-01-01 06:15:00   9.0
2013-01-01 06:45:00   9.0
2013-01-01 06:55:00   9.0
2013-01-01 07:15:00   9.0
2013-01-01 07:45:00   9.0
2013-01-01 07:55:00   9.0
2013-01-01 08:15:00  10.0
2013-01-01 08:45:00  11.0
2013-01-01 08:55:00  11.0
2013-01-01 09:15:00  12.0
2013-01-01 09:45:00  13.0
2013-01-01 09:55:00  13.0
2013-01-01 10:15:00  14.0
2013-01-01 10:45:00  15.0
2013-01-01 10:55:00  15.0
2013-01-01 11:15:00  14.0
2013-01-01 11:45:00  14.0
2013-01-01 11:55:00  14.0
2013-01-01 12:15:00  14.0
2013-01-01 12:45:00  14.0
2013-01-01 12:55:00  14.0
2013-01-01 13:15:00  14.0
2013-01-01 13:45:00  14.0
2013-01-01 13:55:00  14.0
2013-01-01 14:15:00  14.0
2013-01-01 14:45:00  14.0
2013-01-01 14:55:00  14.0
2013-01-01 15:15:00  14.0
2013-01-01 15:45:00  13.0
2013-01-01 15:55:00  13.0
2013-01-01 16:15:00  13.0
2013-01-01 16:45:00  13.0
2013-01-01 16:55:00  13.0
2013-01-01 17:15:00  12.0
2013-01-01 17:45:00  12.0
2013-01-01 17:55:00  12.0
2013-01-01 18:15:00  11.0

JavaScript
 
                        e
2013-01-01 06:15:00   9.0
2013-01-01 06:45:00   9.0
2013-01-01 06:55:00   9.0
2013-01-01 07:15:00   9.0
2013-01-01 07:45:00   9.0
2013-01-01 07:55:00   9.0
2013-01-01 08:15:00  10.0
2013-01-01 08:45:00  11.0
2013-01-01 08:55:00  11.0
2013-01-01 09:15:00  12.0
2013-01-01 09:45:00  13.0
2013-01-01 09:55:00  13.0
2013-01-01 10:15:00  14.0
2013-01-01 10:45:00  15.0
2013-01-01 10:55:00  15.0
2013-01-01 11:15:00  14.0
2013-01-01 11:45:00  14.0
2013-01-01 11:55:00  14.0
2013-01-01 12:15:00  14.0
2013-01-01 12:45:00  14.0
2013-01-01 12:55:00  14.0
2013-01-01 13:15:00  14.0
2013-01-01 13:45:00  14.0
2013-01-01 13:55:00  14.0
2013-01-01 14:15:00  14.0
2013-01-01 14:45:00  14.0
2013-01-01 14:55:00  14.0
2013-01-01 15:15:00  14.0
2013-01-01 15:45:00  13.0
2013-01-01 15:55:00  13.0
2013-01-01 16:15:00  13.0
2013-01-01 16:45:00  13.0
2013-01-01 16:55:00  13.0
2013-01-01 17:15:00  12.0
2013-01-01 17:45:00  12.0
2013-01-01 17:55:00  12.0
2013-01-01 18:15:00  11.0
​

In this case the second one is more granular but that won’t be necessarily the case. I’d like to resample the second one with dates from the first. Is this possible in an elegant pandas way?

I tried reindex with the full dataframes but it complains about duplicate axis. Maybe that’s really my issue.

Answer

A simple new_df = pd.concat((df1,df2), axis=1) retains all information and timestamps. You can choose to resample new_df as wished.

In this specific case, you can do:

pd.concat((df1, df2.groupby(df2.index.floor('H')).mean()), axis=1)

JavaScript
 
pd.concat((df1, df2.groupby(df2.index.floor('H')).mean()), axis=1)
​

Output:

                        a       b       c       d       e
idx                     
2013-01-01 06:00:00     NaN     NaN     NaN     NaN     9.000000
2013-01-01 07:00:00     0.45    24.33   9.04    0.00    9.000000
2013-01-01 08:00:00     0.55    23.11   11.60   0.06    10.666667
2013-01-01 09:00:00     0.69    27.23   18.18   0.03    12.666667
2013-01-01 10:00:00     0.64    26.58   31.46   0.06    14.666667
2013-01-01 11:00:00     0.36    17.50   42.58   0.29    14.000000
2013-01-01 12:00:00     0.32    15.39   50.30   0.17    14.000000
2013-01-01 13:00:00     0.41    17.73   51.45   0.13    14.000000
2013-01-01 14:00:00     0.50    19.48   50.50   0.05    14.000000
2013-01-01 15:00:00     0.48    18.32   51.51   0.03    13.333333
2013-01-01 16:00:00     0.50    18.49   50.70   0.02    13.000000
2013-01-01 17:00:00     1.13    32.89   40.07   0.20    12.000000
2013-01-01 18:00:00     1.81    59.64   16.59   0.37    11.000000

JavaScript
 
                        a       b       c       d       e
idx                     
2013-01-01 06:00:00     NaN     NaN     NaN     NaN     9.000000
2013-01-01 07:00:00     0.45    24.33   9.04    0.00    9.000000
2013-01-01 08:00:00     0.55    23.11   11.60   0.06    10.666667
2013-01-01 09:00:00     0.69    27.23   18.18   0.03    12.666667
2013-01-01 10:00:00     0.64    26.58   31.46   0.06    14.666667
2013-01-01 11:00:00     0.36    17.50   42.58   0.29    14.000000
2013-01-01 12:00:00     0.32    15.39   50.30   0.17    14.000000
2013-01-01 13:00:00     0.41    17.73   51.45   0.13    14.000000
2013-01-01 14:00:00     0.50    19.48   50.50   0.05    14.000000
2013-01-01 15:00:00     0.48    18.32   51.51   0.03    13.333333
2013-01-01 16:00:00     0.50    18.49   50.70   0.02    13.000000
2013-01-01 17:00:00     1.13    32.89   40.07   0.20    12.000000
2013-01-01 18:00:00     1.81    59.64   16.59   0.37    11.000000
​

Advertisement

Answer