For some performance-critical code, I am doing extremely basic performance tests to figure out what's slow and what's fast. Please excuse my terrible timing method, as I have basically no idea what I'm doing. Consider these two functions:
>>> import datetime
>>> def testOneBillion():
...     a = 0
...     print(f'[{datetime.datetime.now()}] Start')
...     for i in range(1_000_000_000):
...         a = i
...     print(f'[{datetime.datetime.now()}] End')
...
>>> testOneBillion()
[2021-04-17 17:25:19.126744] Start
[2021-04-17 17:25:33.370225] End
and
>>> def testTenBillion():
...     a = 0
...     print(f'[{datetime.datetime.now()}] Start')
...     for i in range(10_000_000_000):
...         a = i
...     print(f'[{datetime.datetime.now()}] End')
...
>>> testTenBillion()
[2021-04-17 17:26:10.545044] Start
[2021-04-17 17:37:01.154828] End
Does anyone know why testTenBillion takes more than ten times as long as testOneBillion to finish? I would have expected linear performance scaling, and every previous range increment did behave linearly. What could be causing this?
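For what it's worth, a slightly less noisy way to measure this than printing datetime values is time.perf_counter, a monotonic clock intended for interval timing. A minimal sketch of the same idea wrapped in a helper (not the original code, names are illustrative):

import time

def time_loop(n):
    # Run the same bare counting loop as in the question and report elapsed wall time.
    start = time.perf_counter()
    a = 0
    for i in range(n):
        a = i
    elapsed = time.perf_counter() - start
    print(f'{n:,} iterations took {elapsed:.2f} s')

time_loop(1_000_000_000)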
Answer
The comment under your question might be correct. If I understand it correctly, Python's standard integer format can hold values up to 2,147,483,647 (roughly 2.1 billion).
Source: https://python-reference.readthedocs.io/en/latest/docs/ints/
EDIT: I tested the loop with 2 billion iterations and it took only about 2.1 times as long as 1 billion, which lends credibility to this theory.
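One way to check this theory yourself is to time a few loop sizes on either side of 2,147,483,647 and compare the per-iteration cost; a self-contained sketch (the helper name and the exact sizes are illustrative, not from the original answer):

import time

def per_iteration_cost(n):
    # Time a bare counting loop and return the average seconds per iteration.
    start = time.perf_counter()
    a = 0
    for i in range(n):
        a = i
    return (time.perf_counter() - start) / n

# Loop sizes straddling 2,147,483,647; the exact values are illustrative only.
for n in (1_000_000_000, 2_000_000_000, 3_000_000_000):
    print(f'{n:>13,}: {per_iteration_cost(n) * 1e9:.2f} ns per iteration')

If the per-iteration cost stays flat across all three sizes, the slowdown is linear and the limit theory is weakened; if it jumps past the boundary, that supports it.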