I have two dataframes, each containing one or more time series from the same time frame but sampled at different timestamps.
I’d like to merge them into a single dataframe, resampled and interpolated onto the index of the first.
Here’s a sample of the first dataframe:
                        a      b      c     d
2013-01-01 07:00:00  0.45  24.33   9.04  0.00
2013-01-01 08:00:00  0.55  23.11  11.60  0.06
2013-01-01 09:00:00  0.69  27.23  18.18  0.03
2013-01-01 10:00:00  0.64  26.58  31.46  0.06
2013-01-01 11:00:00  0.36  17.50  42.58  0.29
2013-01-01 12:00:00  0.32  15.39  50.30  0.17
2013-01-01 13:00:00  0.41  17.73  51.45  0.13
2013-01-01 14:00:00  0.50  19.48  50.50  0.05
2013-01-01 15:00:00  0.48  18.32  51.51  0.03
2013-01-01 16:00:00  0.50  18.49  50.70  0.02
2013-01-01 17:00:00  1.13  32.89  40.07  0.20
2013-01-01 18:00:00  1.81  59.64  16.59  0.37
And the second one:
                        e
2013-01-01 06:15:00   9.0
2013-01-01 06:45:00   9.0
2013-01-01 06:55:00   9.0
2013-01-01 07:15:00   9.0
2013-01-01 07:45:00   9.0
2013-01-01 07:55:00   9.0
2013-01-01 08:15:00  10.0
2013-01-01 08:45:00  11.0
2013-01-01 08:55:00  11.0
2013-01-01 09:15:00  12.0
2013-01-01 09:45:00  13.0
2013-01-01 09:55:00  13.0
2013-01-01 10:15:00  14.0
2013-01-01 10:45:00  15.0
2013-01-01 10:55:00  15.0
2013-01-01 11:15:00  14.0
2013-01-01 11:45:00  14.0
2013-01-01 11:55:00  14.0
2013-01-01 12:15:00  14.0
2013-01-01 12:45:00  14.0
2013-01-01 12:55:00  14.0
2013-01-01 13:15:00  14.0
2013-01-01 13:45:00  14.0
2013-01-01 13:55:00  14.0
2013-01-01 14:15:00  14.0
2013-01-01 14:45:00  14.0
2013-01-01 14:55:00  14.0
2013-01-01 15:15:00  14.0
2013-01-01 15:45:00  13.0
2013-01-01 15:55:00  13.0
2013-01-01 16:15:00  13.0
2013-01-01 16:45:00  13.0
2013-01-01 16:55:00  13.0
2013-01-01 17:15:00  12.0
2013-01-01 17:45:00  12.0
2013-01-01 17:55:00  12.0
2013-01-01 18:15:00  11.0
In this case the second one is more granular, but that won’t necessarily be the case. I’d like to resample the second one with dates from the first. Is this possible in an elegant pandas way?
I tried reindex with the full dataframes, but it complains about a duplicate axis. Maybe that’s really my issue.
Answer
A simple new_df = pd.concat((df1, df2), axis=1) retains all information and timestamps. You can then resample new_df as you wish.
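As a rough sketch of that idea, the union index produced by concat can be interpolated in time and then cut back to the first frame's timestamps. This assumes both frames have a unique DatetimeIndex; the small frames below are made up for illustration from a few of the question's values, and the name result is hypothetical.

```python
import pandas as pd

# Small illustrative frames (a few values taken from the question's samples).
df1 = pd.DataFrame(
    {"a": [0.45, 0.55, 0.69]},
    index=pd.to_datetime(
        ["2013-01-01 07:00", "2013-01-01 08:00", "2013-01-01 09:00"]
    ),
)
df2 = pd.DataFrame(
    {"e": [9.0, 9.0, 10.0, 11.0, 11.0, 12.0]},
    index=pd.to_datetime(
        [
            "2013-01-01 06:55",
            "2013-01-01 07:55",
            "2013-01-01 08:15",
            "2013-01-01 08:45",
            "2013-01-01 08:55",
            "2013-01-01 09:15",
        ]
    ),
)

# Outer join on the index: every timestamp from either frame is kept,
# with NaN wherever a frame has no sample. Sort explicitly so the
# time-based interpolation below sees a monotonic index.
new_df = pd.concat((df1, df2), axis=1).sort_index()

# Fill e at df1's timestamps by linear interpolation weighted by time,
# then keep only the rows that belong to df1's index.
new_df["e"] = new_df["e"].interpolate(method="time")
result = new_df.loc[df1.index]
print(result)
```

Here e at 08:00 comes out as 9.25, a quarter of the way between the 07:55 and 08:15 samples. If either index contains repeated timestamps, they would need to be aggregated or dropped first.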
In this specific case, you can do:
pd.concat((df1, df2.groupby(df2.index.floor('H')).mean()), axis=1)
Output:
                        a      b      c     d          e
idx
2013-01-01 06:00:00   NaN    NaN    NaN   NaN   9.000000
2013-01-01 07:00:00  0.45  24.33   9.04  0.00   9.000000
2013-01-01 08:00:00  0.55  23.11  11.60  0.06  10.666667
2013-01-01 09:00:00  0.69  27.23  18.18  0.03  12.666667
2013-01-01 10:00:00  0.64  26.58  31.46  0.06  14.666667
2013-01-01 11:00:00  0.36  17.50  42.58  0.29  14.000000
2013-01-01 12:00:00  0.32  15.39  50.30  0.17  14.000000
2013-01-01 13:00:00  0.41  17.73  51.45  0.13  14.000000
2013-01-01 14:00:00  0.50  19.48  50.50  0.05  14.000000
2013-01-01 15:00:00  0.48  18.32  51.51  0.03  13.333333
2013-01-01 16:00:00  0.50  18.49  50.70  0.02  13.000000
2013-01-01 17:00:00  1.13  32.89  40.07  0.20  12.000000
2013-01-01 18:00:00  1.81  59.64  16.59  0.37  11.000000
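The output keeps a 06:00 row that does not exist in df1, because concat takes the union of both indexes. If only df1's timestamps are wanted, one option is a left join on the hourly means. The sketch below uses small made-up frames with a few of the question's values; hourly_e and result are hypothetical names, and the lowercase "h" alias is used because newer pandas releases prefer it over "H".

```python
import pandas as pd

# Small illustrative frames (a few values taken from the question's samples).
df1 = pd.DataFrame(
    {"a": [0.45, 0.55]},
    index=pd.to_datetime(["2013-01-01 07:00", "2013-01-01 08:00"]),
)
df2 = pd.DataFrame(
    {"e": [9.0, 9.0, 9.0, 9.0, 10.0, 11.0, 11.0]},
    index=pd.to_datetime(
        [
            "2013-01-01 06:55",
            "2013-01-01 07:15",
            "2013-01-01 07:45",
            "2013-01-01 07:55",
            "2013-01-01 08:15",
            "2013-01-01 08:45",
            "2013-01-01 08:55",
        ]
    ),
)

# Bucket df2's samples by the hour they fall in and average each bucket.
hourly_e = df2.groupby(df2.index.floor("h")).mean()

# Left join on the index: only df1's timestamps survive, so the 06:00
# bucket from df2 is dropped instead of producing a row of NaN.
result = df1.join(hourly_e)
print(result)
```

Note that this averages all samples within each hour rather than interpolating at the exact hour mark, so it matches the answer's output, not a point-in-time estimate.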