For some performance-critical code, I am doing extremely basic performance tests to figure out what's slow and what's fast. Please excuse my terrible timing method, as I have basically no idea what I'm doing. Consider these two functions:
>>> import datetime
>>> def testOneBillion():
...     a = 0
...     print(f'[{datetime.datetime.now()}] Start')
...     for i in range(1_000_000_000):
...         a = i
...     print(f'[{datetime.datetime.now()}] End')
...
>>> testOneBillion()
[2021-04-17 17:25:19.126744] Start
[2021-04-17 17:25:33.370225] End
and
>>> def testTenBillion():
...     a = 0
...     print(f'[{datetime.datetime.now()}] Start')
...     for i in range(10_000_000_000):
...         a = i
...     print(f'[{datetime.datetime.now()}] End')
...
>>> testTenBillion()
[2021-04-17 17:26:10.545044] Start
[2021-04-17 17:37:01.154828] End
Does anyone know why testTenBillion is taking more than ten times longer than testOneBillion to finish? I would have expected linear performance scaling, and every previous range increment did behave linearly. What could be causing this?
Answer
The comment under your question might be correct. If I understand it correctly, Python's standard integer format can only hold values up to 2,147,483,647 (about 2.147 billion), so loop counters beyond that limit would presumably have to use a slower representation, which would explain the worse-than-linear scaling.
Source: https://python-reference.readthedocs.io/en/latest/docs/ints/
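As a rough sanity check of my own (not from the linked page): on a typical 64-bit CPython build you can watch the internal representation of an int grow for larger values with sys.getsizeof, although the exact threshold and byte counts depend on the interpreter build:

import sys

# Sizes are for a typical 64-bit CPython build and will differ elsewhere.
# Small positive ints fit in a single internal "digit"; much larger values
# need extra digits, so the reported object size grows.
for value in (1, 1_000_000_000, 10_000_000_000):
    print(f'{value:>14,}: {sys.getsizeof(value)} bytes')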
EDIT: I tested the loop with 2 billion iterations and it took only 2.1x longer than 1 billion, which lends credibility to this theory.
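If you want to reproduce this kind of check yourself, here is a minimal sketch of my own (using time.perf_counter instead of the datetime prints from the question); the absolute numbers will of course vary by machine:

import time

def time_loop(n):
    # Same bare loop as in the question, returning elapsed seconds.
    a = 0
    start = time.perf_counter()
    for i in range(n):
        a = i
    return time.perf_counter() - start

# If the loop scaled perfectly linearly, the "seconds per billion"
# figure would be the same for every size tested.
for n in (500_000_000, 1_000_000_000, 2_000_000_000):
    elapsed = time_loop(n)
    print(f'{n:>13,} iterations: {elapsed:7.1f} s '
          f'({elapsed * 1_000_000_000 / n:.1f} s per billion)')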