Split a generator into chunks without pre-walking it

(This question is related to this one and this one, but those pre-walk the generator, which is exactly what I want to avoid.)

I would like to split a generator into chunks. The requirements are:

  • do not pad the chunks: if the number of remaining elements is less than the chunk size, the last chunk must be smaller.
  • do not walk the generator beforehand: computing the elements is expensive, and it must only be done by the consuming function, not by the chunker.
  • which means, of course: do not accumulate in memory (no lists).

I have tried the following code:

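A minimal sketch of such an attempt, assuming a hypothetical `head` helper that lazily yields up to `size` items from a shared iterator. The names, the sample data, and the `islice` bound on the demo are illustrative assumptions, not the original code:

```python
from itertools import islice

def head(iterator, size):
    """Lazily yield up to `size` items from an already-created iterator."""
    for count, item in enumerate(iterator, start=1):
        yield item
        if count >= size:
            break  # stop right after the last item, without pulling one more

def chunks(iterable, size=10):
    """Yield lazy chunks forever: nothing here notices exhaustion."""
    iterator = iter(iterable)
    while True:
        yield head(iterator, size)

# Bound the demo with islice, since chunks() itself never terminates.
demo = [list(chunk) for chunk in islice(chunks(range(7), 3), 5)]
print(demo)  # [[0, 1, 2], [3, 4, 5], [6], [], []]
```

Once the source is consumed, every further chunk is simply empty, which is why an unbounded consumer loop would spin forever.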

And this somehow works:


Buuuut … it never stops (I have to press ^C) because of the while True. I would like to stop that loop whenever the generator has been consumed, but I do not know how to detect that situation. I have tried raising an Exception:


But then the exception is only raised in the context of the consumer, which is not what I want (I want to keep the consumer code clean).

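To illustrate why the exception surfaces in the consumer, here is a sketch that assumes a hypothetical `GeneratorExhausted` exception raised by `head` when it finds nothing to yield (all names are assumptions for illustration). Generator bodies run lazily, so the `raise` executes only once the consumer iterates the chunk:

```python
class GeneratorExhausted(Exception):
    """Hypothetical custom signal that the source iterator is empty."""

def head(iterator, size):
    count = 0
    for count, item in enumerate(iterator, start=1):
        yield item
        if count >= size:
            break
    if count == 0:
        # Runs only while the consumer is iterating this chunk.
        raise GeneratorExhausted

def chunks(iterable, size=10):
    iterator = iter(iterable)
    while True:
        try:
            # Calling head() only creates a generator; none of its body runs here.
            yield head(iterator, size)
        except GeneratorExhausted:
            break  # not reached in normal iteration: the raise happens in the consumer

seen = []
caught_in_consumer = False
try:
    for chunk in chunks(range(7), 3):
        for item in chunk:
            seen.append(item)
except GeneratorExhausted:
    caught_in_consumer = True  # the consumer has to deal with it

print(seen, caught_in_consumer)  # [0, 1, 2, 3, 4, 5, 6] True
```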

How can I detect that the generator is exhausted in the chunks function, without walking it?

Answer

One way would be to peek at the first element, if any, and then create and return the actual generator.

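A minimal sketch of this peeking idea, assuming Python 3 and a `head` function that takes an iterator (not an arbitrary iterable). The `chunks` wrapper is included here only to make the example runnable; the exact names are illustrative:

```python
def head(iterator, size=10):
    """Return a lazy chunk of up to `size` items.

    Raises StopIteration immediately if the iterator is already depleted.
    """
    first = next(iterator)  # runs eagerly: this is the "peek"

    def head_inner():
        yield first  # hand back the peeked element first
        if size > 1:
            for count, item in enumerate(iterator, start=2):
                yield item
                if count >= size:
                    break

    return head_inner()

def chunks(iterable, size=10):
    iterator = iter(iterable)
    while True:
        try:
            yield head(iterator, size)
        except StopIteration:
            return  # source depleted: end the chunk generator cleanly

result = [list(chunk) for chunk in chunks(range(7), 3)]
print(result)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Only the first element of each chunk is computed by the chunker; the rest is pulled when the consumer iterates the chunk. Each chunk should be consumed before the next one is requested, because all chunks share one underlying iterator.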

Just use this in your chunk generator and catch the StopIteration exception like you did with your custom exception.


Update: Here’s another version, using itertools.islice to replace most of the head function, and a for loop. This simple for loop in fact does exactly the same thing as that unwieldy while-try-next-except-break construct in the original code, so the result is much more readable.

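A sketch of what this `islice`-based version could look like, assuming Python 3 (for `yield from`). The `expensive` generator and the `produced` list are hypothetical helpers added only to show which elements have been computed at each point:

```python
from itertools import islice

def chunks(iterable, size=10):
    iterator = iter(iterable)
    for first in iterator:  # the loop ends when the iterator is depleted
        def chunk():
            yield first  # the element already taken by the for loop
            yield from islice(iterator, size - 1)  # up to size - 1 more
        yield chunk()

produced = []  # records which elements have actually been computed

def expensive():
    for n in range(7):
        produced.append(n)
        yield n

gen = chunks(expensive(), 3)
first_chunk = next(gen)
computed_so_far = list(produced)
print(computed_so_far)         # [0]  (only the peeked element)
first_items = list(first_chunk)
print(first_items)             # [0, 1, 2]
rest = [list(c) for c in gen]
print(rest)                    # [[3, 4, 5], [6]]
```

As with any chunker that shares one iterator, consume each chunk before asking for the next; the inner `chunk` also reads `first` from the enclosing loop, so advancing the outer generator early would change what it yields.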

And we can get even shorter than that, using itertools.chain to replace the inner generator:

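A sketch of the `chain`-based variant, under the same assumptions as before (the sample data is illustrative):

```python
from itertools import chain, islice

def chunks(iterable, size=10):
    iterator = iter(iterable)
    for first in iterator:  # stops once the iterator is depleted
        # Glue the peeked element back in front of the next size - 1 items.
        yield chain([first], islice(iterator, size - 1))

result = [list(chunk) for chunk in chunks(range(7), 3)]
print(result)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Here `[first]` is a one-element list holding only the peeked item, so nothing beyond that single element is accumulated in memory.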
User contributions licensed under: CC BY-SA