I have code that reads vast numbers of dates in ‘YYYY-MM-DD’ format. Parsing each of these dates, so that the code can add one, two, or three days and then write the result back in the same format, is slowing things down considerably.
3214657   14.330    0.000  103.698    0.000 trade.py:56(effective)
3218418   34.757    0.000   66.155    0.000 _strptime.py:295(_strptime)

day = datetime.datetime.strptime(endofdaydate, "%Y-%m-%d").date()
Any suggestions how to speed it up a bit (or a lot)?
Answer
Python 3.7+: fromisoformat()
Since Python 3.7, the datetime class has had a fromisoformat method, and it can be applied to this question as well.
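As a rough sketch of how this maps onto the question (parse, add a few days, write back in the same format), assuming every input is a plain YYYY-MM-DD string: date.fromisoformat and date.isoformat are standard library methods, while shift_iso_date is a hypothetical helper name used only for illustration.

```python
from datetime import date, timedelta


def shift_iso_date(text, days):
    """Parse a 'YYYY-MM-DD' string, add `days` days, return 'YYYY-MM-DD'.

    Hypothetical helper, not part of the original code. It relies on
    date.fromisoformat (Python 3.7+) instead of strptime.
    """
    parsed = date.fromisoformat(text)   # parses ISO calendar dates
    shifted = parsed + timedelta(days=days)
    return shifted.isoformat()          # writes 'YYYY-MM-DD' back out


print(shift_iso_date("1999-12-31", 1))
print(shift_iso_date("2020-02-28", 2))
```

Because isoformat() on a date object emits the same YYYY-MM-DD layout, no strftime call should be needed for the write-back step.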
Performance vs. strptime()
Explicit string slicing may give you about a 9x increase in performance compared to normal strptime, but you can get about a 90x increase with the built-in fromisoformat method!
%timeit isofmt(datelist)
569 µs ± 8.45 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit slice2int(datelist)
5.51 ms ± 48.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit normalstrptime(datelist)
52.1 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
from datetime import datetime, timedelta

base, n = datetime(2000, 1, 1, 1, 2, 3, 420001), 10000
datelist = [(base + timedelta(days=i)).strftime('%Y-%m-%d') for i in range(n)]

def isofmt(l):
    return list(map(datetime.fromisoformat, l))

def slice2int(l):
    def slicer(t):
        return datetime(int(t[:4]), int(t[5:7]), int(t[8:10]))
    return list(map(slicer, l))

def normalstrptime(l):
    return [datetime.strptime(t, '%Y-%m-%d') for t in l]

print(isofmt(datelist[0:1]))
print(slice2int(datelist[0:1]))
print(normalstrptime(datelist[0:1]))

# [datetime.datetime(2000, 1, 1, 0, 0)]
# [datetime.datetime(2000, 1, 1, 0, 0)]
# [datetime.datetime(2000, 1, 1, 0, 0)]
Python 3.8.3rc1 x64 / Win10
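The %timeit lines above are IPython magics, so they will not run in a plain Python script. A minimal sketch of the same comparison using the standard library timeit module might look like the following; best_time is a hypothetical helper, the list is shortened to 1000 dates to keep the run quick, and the absolute numbers will differ by machine and Python version.

```python
import timeit
from datetime import datetime, timedelta

# Smaller sample than the 10000 dates used above, purely to keep this quick.
base = datetime(2000, 1, 1)
datelist = [(base + timedelta(days=i)).strftime('%Y-%m-%d') for i in range(1000)]


def isofmt(l):
    return list(map(datetime.fromisoformat, l))


def slice2int(l):
    return [datetime(int(t[:4]), int(t[5:7]), int(t[8:10])) for t in l]


def normalstrptime(l):
    return [datetime.strptime(t, '%Y-%m-%d') for t in l]


def best_time(func, repeat=3, number=5):
    """Hypothetical helper: best average seconds per call of func(datelist)."""
    runs = timeit.repeat(lambda: func(datelist), repeat=repeat, number=number)
    return min(runs) / number


for func in (isofmt, slice2int, normalstrptime):
    print(f"{func.__name__}: {best_time(func) * 1e3:.3f} ms per call")
```

Taking the minimum over several repeats is one common way to reduce noise from other processes; it is a measurement choice here, not something the original benchmark specifies.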