I am working with tick data and want to resample the DataFrame to one-minute bars. However, when I call resample, the resulting time series starts at the first tick and ends at the last tick. How can I resample this data so that the output covers a specified start and end time?
Edit: here is some sample data.
import pandas as pd

df = pd.DataFrame(data={'Code': pd.Series(['A', 'A', 'B', 'B'], dtype='str'),
                        'Timestamp': pd.Series([1608627600073933, 1698929600124359, 1608627600073933, 1608929600124359], dtype='datetime64[ns]'),
                        'Val': [5, 6, 5, 6]})
df.set_index(['Timestamp'], inplace=True)
# '1min' is the current spelling of the deprecated '1T' alias
df.groupby('Code').resample('1min').agg('sum')
Which outputs:
                     Val
Timestamp
1970-01-19 14:50:00    5
1970-01-19 14:51:00    0
1970-01-19 14:52:00    0
1970-01-19 14:53:00    0
1970-01-19 14:54:00    0
1970-01-19 14:55:00    6
But I would like the output DataFrame to include a timestamp for every minute of a specific hour, for example.
Answer
You can add start and end datetimes manually:
# floor each timestamp to the hour (drops minutes and seconds)
df1 = df.rename(lambda x: x.floor('h'))
# keep one row per distinct hour and select no columns, giving an empty DataFrame
df1 = df1.loc[~df1.index.duplicated(), []]
# join the original rows with one row at the start and one at minute 59 of each hour
df1 = pd.concat([df, df1, df1.rename(lambda x: x + pd.Timedelta('00:59:00'))])
print (df1)
#                               Code  Val
# Timestamp
# 1970-01-19 14:50:27.600073933    A  5.0
# 1970-01-19 14:55:29.600124359    A  6.0
# 1970-01-19 14:00:00.000000000  NaN  NaN
# 1970-01-19 14:59:00.000000000  NaN  NaN

df2 = df1.resample('1min').agg('sum')
print (df2)
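To check that padding each hour with a start and an end row really produces a full hour of bars, here is a minimal sketch restricted to the numeric Val column. The tick values and timestamps are illustrative, and the names ticks, pad and bars are made up for this example; it follows the same idea as the code above rather than reproducing it line for line.

```python
import pandas as pd

# Illustrative ticks; the timestamps mirror the printed output above.
ticks = pd.Series(
    [5, 6],
    index=pd.DatetimeIndex(
        ["1970-01-19 14:50:27.600073933", "1970-01-19 14:55:29.600124359"],
        name="Timestamp",
    ),
    name="Val",
)

# Start of every hour that contains at least one tick.
hour_starts = ticks.index.floor("h").unique()

# Placeholder rows (NaN) at minute 0 and minute 59 of each such hour.
pad_index = hour_starts.union(hour_starts + pd.Timedelta(minutes=59))
pad = pd.Series(float("nan"), index=pad_index, name="Val")

# Combine ticks with the placeholders, then resample to one-minute bars.
# sum() skips NaN, so the placeholders only widen the time range.
padded = pd.concat([ticks, pad]).sort_index()
bars = padded.resample("1min").sum()

print(bars.index[0], bars.index[-1], len(bars))
```

Because the placeholders are NaN, they do not change any sum; they only force the resampled index to run from the first to the last minute of the hour.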
To add the boundary rows per day instead:
# floor each timestamp to the day
df1 = df.rename(lambda x: x.floor('D'))
# keep one row per distinct day and select no columns
df1 = df1.loc[~df1.index.duplicated(), []]
# add one row at 00:00 and one at 23:59 of each day
df1 = pd.concat([df, df1, df1.rename(lambda x: x + pd.Timedelta('23:59:00'))])
print (df1)
#                               Code  Val
# Timestamp
# 1970-01-19 14:50:27.600073933    A  5.0
# 1970-01-19 14:55:29.600124359    A  6.0
# 1970-01-19 00:00:00.000000000  NaN  NaN
# 1970-01-19 23:59:00.000000000  NaN  NaN

df2 = df1.resample('1min').agg('sum')
print (df2)
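If the start and end times are known up front, another option (not the approach above, just a possible alternative) is to resample first and then reindex onto an explicit pd.date_range. The sketch below does this per Code. The sample timestamps, the start and end values and the helper minute_bars are assumptions made for illustration.

```python
import pandas as pd

# Illustrative ticks for two codes within the same hour.
ticks = pd.DataFrame(
    {"Code": ["A", "A", "B", "B"], "Val": [5, 6, 5, 6]},
    index=pd.DatetimeIndex(
        [
            "1970-01-19 14:50:27.600073933",
            "1970-01-19 14:55:29.600124359",
            "1970-01-19 14:50:27.600073933",
            "1970-01-19 14:55:29.600124359",
        ],
        name="Timestamp",
    ),
)

# Desired window: every minute of one specific hour (assumed values).
start = pd.Timestamp("1970-01-19 14:00")
end = pd.Timestamp("1970-01-19 14:59")
full_range = pd.date_range(start, end, freq="1min", name="Timestamp")


def minute_bars(values: pd.Series) -> pd.Series:
    """Sum ticks into one-minute bars, then align them to full_range.

    Minutes without ticks get 0; bars outside the window are dropped.
    """
    return values.resample("1min").sum().reindex(full_range, fill_value=0)


# One block of bars per code, stacked under a (Code, Timestamp) MultiIndex.
bars = pd.concat(
    {code: minute_bars(group["Val"]) for code, group in ticks.groupby("Code")},
    names=["Code", "Timestamp"],
)

print(bars.loc["A"].head())
```

Unlike padding with placeholder rows, reindex also trims any bars that fall outside the requested window, so every code ends up with exactly the same set of minutes.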