Detecting cheapest way to build independent iterators

Question

Suppose I'm writing a function taking in an iterable, and my function wants to be agnostic as to whether that iterable is actually an iterator yet or not. (This is a common situation, right? I think basically all the itertools functions are written this way. Take in an iterable, return an iterator.) If I call, for instance, itertools.tee(•, 2) on

Accepted Answer

Observe:

    >>> def foo(x):
    ...     return x.__iter__()  # or return iter(x)
    ...
    >>> l = [0, 1]
    >>> it = l.__iter__()
    >>> it
    <list_iterator object at 0x...>
    >>> print(foo(l), foo(it))
    <list_iterator object at 0x...> <list_iterator object at 0x...>

So you do not need to worry whether the argument to your function is an iterable or already an iterator. You can call __iter__ on something that is already an iterator, and in that case it just returns self. This is not an expensive call, and it is cheaper than anything you could do to test whether the argument is an iterator, such as checking whether it has a __next__ method (and then having to call __iter__ on it anyway if it doesn't).

Update

We now see that there is a big difference between passing your function an iterable and passing it an iterator (depending on how the iterator is written, of course): calling iter twice on the former will give you two distinct iterators, while calling iter twice on the latter will not. itertools.tee, as an example, expects an iterable. If you pass it an iterator whose __iter__ returns self, it will clearly work, since tee does not need two independent iterators for it to do its magic.

But if you are writing an iterator that is passed an iterable, and your iterator is implemented by internally using two or more iterators on the passed iterable, what you really want to test is whether the passed object supports multiple, concurrent, independent iterations, regardless of whether it is an iterator or just a plain iterable:

    def my_iterator(iterable):
        it1 = iter(iterable)
        it2 = iter(iterable)
        if it1 is it2:
            raise ValueError('The passed iterable does not support multiple, concurrent, independent iterations.')
        ...

    class Foo:
        def __init__(self, lst):
            self.lst = lst

        def __iter__(self):
            self.idx = 0
            return self

        def __next__(self):
            if self.idx < len(self.lst):
                value = self.lst[self.idx]
                self.idx += 1
                return value
            raise StopIteration()

    f = Foo("abcd")
    for x in f:
        print(x)

    my_iterator(f)

Prints:

    a
    b
    c
    d
    Traceback (most recent call last):
      File "C:Boobootesttest.py", line 26, in <module>
        my_iterator(f)
      File "C:Boobootesttest.py", line 5, in my_iterator
        raise ValueError('The passed iterable does not support multiple, concurrent, independent iterations.')
    ValueError: The passed iterable does not support multiple, concurrent, independent iterations.

The writer of the original, passed iterable must write it in such a way that it supports multiple, concurrent, independent iterations.
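If raising an error is too strict, the same identity check could instead be used to pick the cheaper strategy. The following is only a minimal sketch under stated assumptions: independent_iterators is a hypothetical helper name, not something from the question or from itertools, and it assumes that two distinct objects returned by iter() really are independent, which the check itself cannot prove.

```python
import itertools


def independent_iterators(iterable, n=2):
    """Return a tuple of n iterators over iterable (hypothetical helper)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    first = iter(iterable)
    second = iter(iterable)
    if first is second:
        # iter() handed back the same object twice, so repeated iter() calls
        # cannot give independent iterators; fall back to buffering with tee.
        return itertools.tee(first, n)
    # Distinct objects: assumed (not guaranteed) to be independent, so no
    # buffering is needed and the remaining iterators come from iter() too.
    return (first, second) + tuple(iter(iterable) for _ in range(n - 2))


# A list yields fresh iterators, so tee is skipped.
a, b = independent_iterators([0, 1, 2])
print(list(a), list(b))  # [0, 1, 2] [0, 1, 2]

# A generator is its own iterator, so tee is used.
gen = (c for c in "abc")
x, y = independent_iterators(gen)
print(list(x), list(y))  # ['a', 'b', 'c'] ['a', 'b', 'c']
```

As with any use of tee, the original iterator should not be advanced directly once it has been handed to this helper.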

Answer