import time
from concurrent.futures import ThreadPoolExecutor


class Sandbox:
    def __init__(self, n=5):
        self.n = n
        self.executor = ThreadPoolExecutor(n)

    def make_request(self):
        if self.executor._work_queue.qsize() < self.n:
            self.executor.submit(self.do_something_that_takes_long)
            print('HTTP_202')
        else:
            print('HTTP_429')

    def do_something_that_takes_long(self):
        time.sleep(10)


def do_ok_situation():
    s = Sandbox()
    for _ in range(5):
        s.make_request()


def do_bad_situation():
    s = Sandbox()
    for _ in range(100):
        s.make_request()


# do_ok_situation()
do_bad_situation()
This will output:

HTTP_202
HTTP_202
HTTP_202
HTTP_202
HTTP_202
HTTP_202
HTTP_202
HTTP_202
HTTP_202
HTTP_202
HTTP_429
HTTP_429
HTTP_429
HTTP_429
...
This code outputs 10 HTTP_202s (on my machine) instead of 5. I expected the number of accepted requests to equal the number of jobs put into the thread executor's queue.
Why is this the case? How can I limit this number to the number of max workers?
Answer
It appears that self.executor._work_queue.qsize() returns the number of requests that are sitting in the work queue waiting for a thread to execute them. However, when you call submit() there is often an idle thread in the thread pool that is immediately available to handle the request, so for the first five calls to make_request() the request doesn't go into the work queue at all; it is handed directly to a thread to execute.
You can demonstrate this behavior to yourself by adding a line like print("qSize=%i" % self.executor._work_queue.qsize()) at the top of your make_request() method; you'll see that qSize remains 0 for the first 5 calls, and only starts growing after all 5 threads in the pool are busy executing do_something_that_takes_long, at which point additional requests go into the queue instead.
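To actually cap acceptances at the number of workers, one option is to stop relying on the private _work_queue at all and count in-flight jobs (queued plus running) yourself. The sketch below is one way to do that, not code from the question; the delay parameter is an addition for illustration so the behavior is easy to observe with short sleeps.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Sandbox:
    def __init__(self, n=5, delay=10):
        self.n = n
        self.delay = delay  # illustrative parameter, not in the original code
        self.executor = ThreadPoolExecutor(n)
        # Count jobs that are queued *or* currently running;
        # _work_queue.qsize() only sees jobs still waiting in the queue.
        self._in_flight = 0
        self._lock = threading.Lock()

    def make_request(self):
        with self._lock:
            if self._in_flight >= self.n:
                print('HTTP_429')
                return
            self._in_flight += 1
        future = self.executor.submit(self.do_something_that_takes_long)
        # Decrement the counter when the job finishes,
        # whether it completed normally or raised.
        future.add_done_callback(self._job_done)
        print('HTTP_202')

    def _job_done(self, future):
        with self._lock:
            self._in_flight -= 1

    def do_something_that_takes_long(self):
        time.sleep(self.delay)
```

With this version, 100 rapid calls to make_request() print exactly 5 HTTP_202s and 95 HTTP_429s, because the counter is incremented before submission and only decremented when a job actually finishes.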