Speed comparison for iterating over a List and a Generator in Python

While comparing Python generators and lists for performance, I read that generators are faster to create than lists, but that iterating over a list is faster than iterating over a generator. I wrote an example to test this with a small and a large data set, and the two results contradict each other.

When I time iteration over a generator and a list built from range(1_000_000_000), so that each one yields 500,000,000 numbers, generator iteration comes out faster than list iteration:

from time import time

my_generator = (i for i in range(1_000_000_000) if i % 2 == 0)

start = time()
for i in my_generator:
    pass
print("Time for Generator iteration - ", time() - start)
my_list = [i for i in range(1_000_000_000) if i % 2 == 0]

start = time()
for i in my_list:
    pass
print("Time for List iteration - ", time() - start)

And the output is:

Time for Generator iteration -  67.49345350265503
Time for List iteration - 89.21837282180786

But if I use a smaller input, 10_000_000 instead of 1_000_000_000, list iteration is faster than generator iteration.

from time import time

my_generator = (i for i in range(10_000_000) if i % 2 == 0)

start = time()
for i in my_generator:
    pass
print("Time for Generator iteration - ", time() - start)

my_list = [i for i in range(10_000_000) if i % 2 == 0]

start = time()
for i in my_list:
    pass
print("Time for list iteration - ", time() - start)

The output is:

Time for Generator iteration -  1.0233261585235596
Time for list iteration -  0.11701655387878418

Why is this behaviour happening?

Answer

After understanding the points made by @gimix and @Dani Mesejo, I found the answer. List iteration is indeed faster than generator iteration.

With a generator, every step of the loop resumes the generator, much like a function call, and the remainder (modulus) operation is also evaluated on each step, which makes each iteration slower still. With a list, that work is done once, when the list is created, so the loop only has to walk over values that already exist. Creating a list may therefore be slower than creating a generator, but iterating over a list is definitely faster than iterating over a generator.
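To make the point above concrete, here is a minimal sketch that counts how often the filter runs at each stage. The names count_filter_calls and is_even are hypothetical helpers introduced only for this illustration; they are not part of the original code.

```python
def count_filter_calls(n=10):
    """Count how many times the filter has run after creation and after iteration.

    Hypothetical helper for illustration only.
    """
    calls = {"count": 0}

    def is_even(i):
        # Record every evaluation of the modulus test.
        calls["count"] += 1
        return i % 2 == 0

    # Generator: creating it does not run the filter at all.
    gen = (i for i in range(n) if is_even(i))
    gen_after_creation = calls["count"]
    for _ in gen:
        pass
    gen_after_iteration = calls["count"]

    # List: the filter runs n times during creation, never during iteration.
    calls["count"] = 0
    lst = [i for i in range(n) if is_even(i)]
    list_after_creation = calls["count"]
    for _ in lst:
        pass
    list_after_iteration = calls["count"]

    return (gen_after_creation, gen_after_iteration,
            list_after_creation, list_after_iteration)


print(count_filter_calls(10))
```

For the generator the filter calls all happen inside the loop, while for the list they all happen before the loop starts. This is why timing only the loop, as the code in the question does, charges the modulus work to the generator but not to the list.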

The code above uses time() from the time module, which is not a reliable way to benchmark. I then used timeit with 1_000_000 and with 1_000_000_000 items, and in both cases list iteration was faster:

import timeit

mysetup = '''my_generator = (i for i in range(10_000_000) if i % 2 == 0)
'''

mycode = '''
for i in my_generator:
    pass
'''

mysetup1 = '''my_list = [i for i in range(10_000_000) if i % 2 == 0]'''

mycode1 = '''
for i in my_list:
    pass
'''
print(timeit.timeit(setup=mysetup, stmt=mycode, number=1))
print(timeit.timeit(setup=mysetup1, stmt=mycode1, number=1))
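One caveat with the setup/stmt approach: the setup string runs only once per timing, and a generator is exhausted after a single pass, so any number greater than 1 would time a loop over an already empty generator. The following is a rough sketch, not a definitive benchmark, that reports creation and iteration separately and rebuilds the generator for every run. The function name benchmark, the labels, and the default sizes are assumptions chosen for illustration.

```python
import timeit


def benchmark(n=100_000, repeats=3):
    """Return best-of-`repeats` timings in seconds (illustrative sketch)."""

    def create_gen():
        return (i for i in range(n) if i % 2 == 0)

    def create_list():
        return [i for i in range(n) if i % 2 == 0]

    def consume(iterable):
        for _ in iterable:
            pass

    def best(func):
        # timeit.repeat accepts a callable; keep the fastest of several runs.
        return min(timeit.repeat(func, number=1, repeat=repeats))

    prebuilt_list = create_list()

    return {
        "generator: create only": best(create_gen),
        "list: create only": best(create_list),
        # A fresh generator is built on every run because it can be
        # consumed only once.
        "generator: create + iterate": best(lambda: consume(create_gen())),
        "list: create + iterate": best(lambda: consume(create_list())),
        "list: iterate only": best(lambda: consume(prebuilt_list)),
    }


for label, seconds in benchmark().items():
    print(f"{label:30s} {seconds:.6f}")
```

Comparing the "create + iterate" rows gives a like-for-like total cost, while "list: iterate only" corresponds to what the loop over my_list measures in the code above. Actual numbers will vary by machine and Python version.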
User contributions licensed under: CC BY-SA