How to deoptimize memory access in Python?

Question

This may not be useful; it's just a challenge I have set for myself. Say you have a big array. What can you do so that the program does not benefit from caching or cache-line prefetching, for example by arranging things so that the next memory access can only be determined after the previous access finishes? So we have our array: array

Accepted Answer

I did not expect any difference, but accessing the elements in random order is in fact significantly slower than accessing them in order or in reverse order (which are both about the same).

>>> import random
>>> N = 10**5
>>> arr = [random.randint(0, 1000) for _ in range(N)]
>>> srt = list(range(N))
>>> rvd = srt[::-1]
>>> rnd = random.sample(srt, N)
>>> %timeit sum(arr[i] for i in srt)
10 loops, best of 5: 24.9 ms per loop
>>> %timeit sum(arr[i] for i in rvd)
10 loops, best of 5: 25.7 ms per loop
>>> %timeit sum(arr[i] for i in rnd)
10 loops, best of 5: 59.2 ms per loop

And it really seems to be the randomness. Accessing the indices out of order but with a pattern, e.g. as [0, N-1, 2, N-3, ...] or [0, N/2, 1, N/2+1, ...], is just as fast as accessing them in order:

>>> alt1 = [i if i % 2 == 0 else N - i for i in range(N)]
>>> alt2 = [i for p in zip(srt[:N//2], srt[N//2:]) for i in p]
>>> %timeit sum(arr[i] for i in alt1)
10 loops, best of 5: 24.5 ms per loop
>>> %timeit sum(arr[i] for i in alt2)
10 loops, best of 5: 24.1 ms per loop

Interestingly, just iterating the shuffled indices (and calculating their sum, as with the array above) is also slower than doing the same with the sorted indices, but not by as much. Of the ~35 ms difference between srt and rnd, ~10 ms seem to come from iterating the randomized indices, and ~25 ms from actually accessing the array elements in random order.

>>> %timeit sum(i for i in srt)
100 loops, best of 5: 19.7 ms per loop
>>> %timeit sum(i for i in rnd)
10 loops, best of 5: 30.5 ms per loop
>>> %timeit sum(arr[i] for i in srt)
10 loops, best of 5: 24.5 ms per loop
>>> %timeit sum(arr[i] for i in rnd)
10 loops, best of 5: 56 ms per loop

(IPython 5.8.0 / Python 3.7.3 on a rather old laptop running Linux)
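For anyone who wants to repeat the measurement outside IPython, here is a minimal sketch using the standard `timeit` module in place of `%timeit`. The helper names (`build_orders`, `sum_by_index`, `sum_indices`, `benchmark`) and the fixed seed are assumptions made for illustration; they are not part of the session above. The absolute numbers will differ by machine and Python version, so only the relative ordering is worth comparing.

```python
import random
import timeit


def build_orders(n, seed=0):
    """Return the index orders from the answer, keyed by the same names.

    Assumes n is even; for odd n, alt1 is no longer a permutation.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    srt = list(range(n))
    return {
        "srt": srt,                                   # 0, 1, 2, ...
        "rvd": srt[::-1],                             # n-1, n-2, ...
        "rnd": rng.sample(srt, n),                    # random permutation
        "alt1": [i if i % 2 == 0 else n - i for i in range(n)],
        "alt2": [i for p in zip(srt[: n // 2], srt[n // 2:]) for i in p],
    }


def sum_by_index(arr, order):
    """Sum arr's elements, visiting them in the given index order."""
    return sum(arr[i] for i in order)


def sum_indices(order):
    """Baseline: iterate the index list only, without touching arr."""
    return sum(i for i in order)


def benchmark(n=10**5, repeat=5, number=10, seed=0):
    """Return {name: (baseline_ms, access_ms)}, best of `repeat` runs."""
    rng = random.Random(seed)
    arr = [rng.randint(0, 1000) for _ in range(n)]
    results = {}
    for name, order in build_orders(n, seed).items():
        base = timeit.repeat(lambda: sum_indices(order),
                             repeat=repeat, number=number)
        access = timeit.repeat(lambda: sum_by_index(arr, order),
                               repeat=repeat, number=number)
        # Convert the best total time to milliseconds per single loop.
        results[name] = (min(base) / number * 1000,
                         min(access) / number * 1000)
    return results
```

Calling `benchmark()` and printing each entry gives one baseline/access pair per ordering. Subtracting the baseline from the access time is a rough way to separate the cost of walking the index list from the cost of the array lookups, in the same spirit as the last comparison above.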

Answer