Suppose I have the following function:
def print_twice(x):
    for i in x:
        print(i)
    for i in x:
        print(i)
When I run:
print_twice([1,2,3])
or:
print_twice((1,2,3))
I get the expected result: the numbers 1,2,3 are printed twice.
But when I run:
print_twice(zip([1,2,3],[4,5,6]))
the pairs (1,4),(2,5),(3,6) are printed only once. Probably, this is because zip
returns an iterator that is exhausted after one pass.
How can I modify the function print_twice such that it will correctly handle all inputs?
I could insert a line at the beginning of the function: x = list(x). But this might be inefficient in case x is already a list, a tuple, a range, or any other iterable that can be iterated more than once. Is there a more efficient solution?
Answer
"I could insert a line at the beginning of the function: x = list(x). But this might be inefficient in case x is already a list, a tuple, a range, or any other iterable that can be iterated more than once. Is there a more efficient solution?"
Copying single-use iterables to a list is perfectly adequate, and reasonably efficient even for multi-use iterables.
The list (and to some extent tuple) type is one of the most optimised data structures in Python. Common operations such as copying a list or tuple to a list are internally optimised;[1] even for iterables that are not special-cased, copying them to a list is significantly faster than any realistic work done by two (or more) loops.
def print_twice(x):
    x = list(x)
    for i in x:
        print(i)
    for i in x:
        print(i)
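To get a feel for the relative cost on your own machine, a rough timing sketch along these lines may help. The helper two_loops and the 128-item input are assumptions chosen for illustration, and the loop body is deliberately trivial, so treat the numbers as indicative only.

```python
import timeit

# Hypothetical input: a small list, standing in for a typical argument.
data = list(range(128))

def two_loops(x):
    # Stand-in for "real work": two passes that just sum the items.
    total = 0
    for i in x:
        total += i
    for i in x:
        total += i
    return total

# Time the defensive copy and the two passes separately.
copy_time = timeit.timeit(lambda: list(data), number=10_000)
loop_time = timeit.timeit(lambda: two_loops(data), number=10_000)
print(f"copy: {copy_time:.4f}s, two loops: {loop_time:.4f}s")
```

Results depend on the interpreter version and hardware, so no particular output is claimed here.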
Copying indiscriminately can also be advantageous in the context of concurrency, when the iterable may be modified while the function is running. Common cases are threading and weakref collections.
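As a simplified, single-threaded stand-in for that situation, the sketch below mutates a set while it is being looped over. The function drain and its snapshot flag are hypothetical names made up for this illustration; with real threads or weakref collections the modification would come from elsewhere, but the failure mode is similar.

```python
def drain(items, snapshot):
    # With snapshot=True we iterate over a private copy of the set.
    source = list(items) if snapshot else items
    seen = []
    for item in source:
        seen.append(item)
        items.discard(item)  # mutates the original set mid-loop
    return seen

pending = {1, 2, 3}
try:
    drain(pending, snapshot=False)
except RuntimeError as exc:
    # CPython refuses to continue iterating a set whose size changed.
    print("without copy:", exc)

pending = {1, 2, 3}
print("with copy:", sorted(drain(pending, snapshot=True)))
```

Iterating over the copy lets the loop finish even though the original collection shrinks underneath it.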
In case one wants to avoid needless copies, checking whether the iterable is a Collection is a reasonable guard.
from collections.abc import Collection

x = list(x) if not isinstance(x, Collection) else x
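Put together, the guard could look roughly like the following. The helper name as_reusable is an assumption introduced here for illustration; only the isinstance check itself comes from the snippet above.

```python
from collections.abc import Collection

def as_reusable(x):
    # Hypothetical helper: keep Collections as they are, copy anything else.
    return x if isinstance(x, Collection) else list(x)

def print_twice(x):
    x = as_reusable(x)
    for i in x:
        print(i)
    for i in x:
        print(i)

# A zip object is not a Collection, so it is copied once and printed twice.
print_twice(zip([1, 2, 3], [4, 5, 6]))
```

Lists, tuples and ranges pass through without a copy, while zip objects and generators are materialised once.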
Alternatively, one can check whether the iterable is in fact an iterator, since this implies statefulness and thus single-use.
from collections.abc import Iterator

x = list(x) if isinstance(x, Iterator) else x
# or, without the import:
x = list(x) if iter(x) is x else x
Notably, the builtins zip, filter, map, … and generators all are iterators.
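This is easy to check interactively. The sketch below uses a hypothetical helper, is_single_use, which simply wraps the Iterator check, and shows why a second pass over a zip object yields nothing.

```python
from collections.abc import Iterator

def is_single_use(x):
    # Hypothetical helper: iterators carry their own position, so they run once.
    return isinstance(x, Iterator)

pairs = zip([1, 2, 3], [4, 5, 6])
print(is_single_use(pairs))        # zip objects are iterators
print(iter(pairs) is pairs)        # an iterator returns itself from iter()
print(list(pairs))                 # first pass consumes the pairs
print(list(pairs))                 # second pass finds nothing left

# Lists and ranges are iterable but are not iterators themselves.
print(is_single_use([1, 2, 3]), is_single_use(range(3)))
```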
[1] Copying a list of 128 items is roughly as fast as checking whether it is a Collection.