I have a DataFrame indexed by time in seconds, and I resample it every n seconds so that all values are evenly spaced.
The times are parsed correctly, but the output looks wrong, so maybe I'm misunderstanding what exactly is being splined over?
```python
import pandas as pd
from scipy.interpolate import interp1d

df = pd.DataFrame(
    {
        "time": [0., 1.1, 3.3, 4.4, 5.5, 7.7, 9.9, 10.0],
        "floats": [0., 0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.0],
        "ints": [0, 1, 1, 1, 1, 0, 0, 1],
    }
)

df["time"] = pd.to_timedelta(df["time"], unit="s")

df.set_index("time", inplace=True)
df_interpolated = df.resample("2s").interpolate("spline", order=1)

print("Input:")
print(df)

print("Output:")
print(df_interpolated)

f_data_int = interp1d(df.index.astype(int), df["ints"])
interpolated_int = f_data_int(df_interpolated.index.astype(int))

f_data_float = interp1d(df.index.astype(int), df["floats"])
interpolated_float = f_data_float(df_interpolated.index.astype(int))

df_fixed = df_interpolated.copy()

df_fixed["floats"] = interpolated_float
df_fixed["ints"] = interpolated_int#.astype(int)
print("Expected:")
print(df_fixed.round(2))
```
Gives
```text
Input:
                        floats  ints
time
0 days 00:00:00            0.0     0
0 days 00:00:01.100000     0.1     1
0 days 00:00:03.300000     0.2     1
0 days 00:00:04.400000     0.3     1
0 days 00:00:05.500000     0.4     1
0 days 00:00:07.700000     0.3     0
0 days 00:00:09.900000     0.2     0
0 days 00:00:10            0.0     1

Output:
                 floats  ints
time
0 days 00:00:00     0.0   0.0
0 days 00:00:02     0.0   0.2
0 days 00:00:04     0.0   0.4
0 days 00:00:06     0.0   0.6
0 days 00:00:08     0.0   0.8
0 days 00:00:10     0.0   1.0

Expected:
                 floats  ints
time
0 days 00:00:00    0.00  0.00
0 days 00:00:02    0.14  1.00
0 days 00:00:04    0.26  1.00
0 days 00:00:06    0.38  0.77
0 days 00:00:08    0.29  0.00
0 days 00:00:10    0.00  1.00
```
So where did my values go in the output?
Answer
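When you resample to 2 s, only the rows whose timestamps fall exactly on the 2 s grid are kept (here 0 s and 10 s); every other row is dropped before the interpolation runs. The spline is therefore fitted through just those two points, which is why the output is a straight line from 0 to 1 for ints and all zeros for floats.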
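To see which rows survive, here is a small sketch. It rebuilds the question's `df` so that it runs on its own; `grid`, `on_grid`, and `survivors` are names introduced here for illustration. It reindexes the data onto the 2 s grid, which, judging by the output in the question, appears to be what happens before the spline is fitted:

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    {
        "time": [0., 1.1, 3.3, 4.4, 5.5, 7.7, 9.9, 10.0],
        "floats": [0., 0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.0],
        "ints": [0, 1, 1, 1, 1, 0, 0, 1],
    }
)
df["time"] = pd.to_timedelta(df["time"], unit="s")
df = df.set_index("time")

# The 2 s grid between the first and the last sample.
grid = pd.timedelta_range(start=df.index.min(), end=df.index.max(), freq="2s")

# Reindexing keeps a value only where an original timestamp
# coincides exactly with a grid point; everything else becomes NaN.
on_grid = df.reindex(grid)
print(on_grid)

# The resampler's asfreq() should give the same picture.
print(df.resample("2s").asfreq())

# Rows that actually carry data after the move to the grid.
survivors = on_grid.dropna()
print(survivors)
```

Those surviving rows are all the spline gets to work with.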
One way around this is to upsample to a fine grid first, interpolate there, and only then keep the points on the 2 s grid:

```python
import datetime

# Upsample with a very small step. The step has to divide every timestamp:
# a time such as 0.111 s is not a multiple of 0.1 s and would still be dropped.
step = datetime.timedelta(seconds=0.1)

# Upsample and interpolate on the fine grid.
df_interpolated = df.resample(step).interpolate("spline", order=1)

# Keep only the points at 2 s intervals. Every 2 s point already exists on the
# fine grid, so asfreq() has no gaps left to fill.
df_interpolated = df_interpolated.resample("2s").asfreq()
```
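If plain linear interpolation is all that is needed (which is what `interp1d` does by default in the question's check), the intermediate upsampling can be skipped. This is a minimal sketch, assuming NumPy is available; `grid`, `x_old`, `x_new`, and `df_even` are names introduced here for illustration, and `np.interp` is swapped in for the spline:

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    {
        "time": [0., 1.1, 3.3, 4.4, 5.5, 7.7, 9.9, 10.0],
        "floats": [0., 0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.0],
        "ints": [0, 1, 1, 1, 1, 0, 0, 1],
    }
)
df["time"] = pd.to_timedelta(df["time"], unit="s")
df = df.set_index("time")

# Target grid: every 2 s from the first to the last sample.
grid = pd.timedelta_range(start=df.index.min(), end=df.index.max(), freq="2s")

# Work in float seconds so that np.interp sees plain numbers.
x_old = df.index.total_seconds().to_numpy()
x_new = grid.total_seconds().to_numpy()

# Linearly interpolate every column from the original times onto the grid.
df_even = pd.DataFrame(
    {col: np.interp(x_new, x_old, df[col].to_numpy(dtype=float)) for col in df.columns},
    index=grid,
)
df_even.index.name = "time"

print(df_even.round(2))
```

Because all original samples take part in the interpolation, the result should agree with the "Expected" table in the question. The step size no longer matters either, since nothing is snapped to an intermediate grid.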