I would like to create datetime objects from a list of string timecodes like the ones below. However, parser.parse interprets them incorrectly for my use case.
from datetime import datetime
from dateutil import parser

timecodes = ['0:00', '0:01', '1:01', '10:01', '1:10:01']

# Parse each timecode in turn and print the result
for timecode in timecodes:
    dt = parser.parse(timecode)
    print(dt)
The list above comes from YouTube’s transcript timecodes. When copied from the site, they use a variable format to designate hours, minutes, and seconds, based on elapsed time:
0:00     # 0 minutes, 0 seconds
0:01     # 0 minutes, 1 seconds
1:01     # 1 minutes, 1 seconds
10:01    # 10 minutes, 1 seconds
1:10:01  # 1 hours, 10 minutes, 1 seconds
Passing these to parser.parse results in the following (the comments are my interpretations):
2022-10-24 00:00:00  # 0 minutes, 0 seconds
2022-10-24 00:01:00  # 1 minutes, 0 seconds
2022-10-24 01:01:00  # 1 hours, 1 minutes, 0 seconds
2022-10-24 10:01:00  # 10 hours, 1 minutes, 0 seconds
2022-10-24 01:10:01  # 1 hours, 10 minutes, 1 seconds
i.e. if a string doesn’t consist of a full timecode including hours, minutes, and seconds, then parser.parse appears to treat minutes as hours and seconds as minutes.
How can I either parse the list so that two-field timecodes default to minutes and seconds instead of hours and minutes, or alternatively adjust the timecodes so that they conform to the format parser.parse expects?
Answer
This is a little tricky but should work:
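import datetime

timecodes = ['0:00', '0:01', '1:01', '10:01', '1:10:01']
zeroes = ['0', '0', '0']
dt = []
for i in timecodes:
    sep = i.split(':')
    # Left-pad with zeros so every timecode has hours, minutes, seconds
    sep = zeroes[:3 - len(sep)] + sep
    # Use enumerate for the field position: sep.index(s) returns the first
    # match, which is wrong for repeated values such as '10:10:01'
    seconds = sum(int(s) * 60 ** (2 - pos) for pos, s in enumerate(sep))
    dt.append(str(datetime.timedelta(seconds=seconds)))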
Output:
dt = ['0:00:00', '0:00:01', '0:01:01', '0:10:01', '1:10:01']
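If you need actual objects rather than strings, here is a minimal sketch of the same zero-padding idea using only the standard library. The helper names timecode_to_timedelta and timecode_to_datetime, and the base date of 1900-01-01, are my own assumptions for illustration and not part of dateutil or the original answer; swap in whatever base date suits you.

```python
from datetime import datetime, timedelta


def timecode_to_timedelta(timecode: str) -> timedelta:
    """Convert 'S', 'M:SS' or 'H:MM:SS' into a timedelta (hypothetical helper)."""
    parts = [int(p) for p in timecode.split(':')]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected timecode: {timecode!r}")
    # Left-pad with zeros so the fields are always [hours, minutes, seconds]
    hours, minutes, seconds = [0] * (3 - len(parts)) + parts
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def timecode_to_datetime(timecode: str, base: datetime = datetime(1900, 1, 1)) -> datetime:
    """Offset an assumed base date by the elapsed time (hypothetical helper)."""
    return base + timecode_to_timedelta(timecode)


timecodes = ['0:00', '0:01', '1:01', '10:01', '1:10:01']
for tc in timecodes:
    print(tc, '->', timecode_to_timedelta(tc), '|', timecode_to_datetime(tc))
```

A timedelta is probably the more natural type here, since the timecodes are elapsed times rather than points on a calendar; the datetime version is only there in case something downstream insists on datetime objects.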