I’m trying to convert HTTP timestamps into standard timestamp for complete data framing and getting time-series plots. I’m looking for an efficient way to do this for the large dataset. My actual data frame is as follows:
# +----------------------+-------+ # | timestamp | col1 | # +----------------------+-------+ # | 10/May/2021:19:48:36 | 714 | # | 10/May/2021:00:28:12 | None | # | 10/May/2021:15:34:26 | 634 | # | 10/May/2021:00:20:25 | 8453 | # | 10/May/2021:14:31:32 | None | # +---------------------+-------+
I have tried couple of the following methods and get errors:
df["timestamp"] = pd.to_datetime(df['timestamp'], errors='coerce', format='%Y-%m-%d %H:%M:%S') df["timestamp"]
This returns me NaT which is strange!
0         NaT
1         NaT
2         NaT
3         NaT
4         NaT
           ..
I updated the format and change it from format='%Y-%m-%d %H:%M:%S' to format='%d/%b/%Y:%H:%M:%S' and worked and I can use df.plot(x='timestamp', y='col1'). Additionally I don’t see any changes/formatting in timestamps in plot:

The problem is when I plot it, timestamps related xticks in x axis will be plotted as they are but I couldn’t limited time stamps to hours in the same day. This is not practical for my plot when events are belong to a same day. I tried to access/filter hours by this:
df['timestamp'].dt.hour but it returns this error: AttributeError: Can only use .dt accessor with datetimelike values I searched this error and checks previous posts in this regard but didn’t work for my case like this answer.
Any help will be highly appreciated
Advertisement
Answer
Try via split(), join() and to_datetime() method:
df['timestamp']=pd.to_datetime(df['timestamp'].str.split(':',1).str.join(' '),errors='coerce')
output of df:
timestamp col1 0 2021-05-10 19:48:36 714 1 2021-05-10 00:28:12 None 2 2021-05-10 15:34:26 634 3 2021-05-10 00:20:25 8453 4 2021-05-10 14:31:32 None