Skip to content
Advertisement

One Queue for each Consumer – Python

I’m in a single-producer/multiple-consumers scenario. Consider that each job is independent and the consumers do not communicate among them.

  1. Could it be a good idea to create a different queue for each consumer? In that way, the producer adds jobs in each queue in a round-robin fashion and there are no delays in accessing a single queue.

  2. Or is it better to minimize the number of queues as much as possible?

  3. In the case of a single Queue and lots of consumers (like 20 or more), is the delay due to the synchronization access to the queue relevant?

I’m using Python 3.7 and multithreading/multiprocessing to create several Consumers. Each Consumer needs to run an executable and perform some I/O operation (write, move or copy files). I’ve currently developed it with multiprocessing and single queue, but I’m thinking to change the approach to multithreading and multiple queues.

Single Queue

                          Consumer
                        /
                       /  ..
Producer --> [ Queue ] -- Consumer
                         ..
                        
                          Consumer

Multiple Queue

               -> [ Queue ] -- Consumer
             / 
            / ..
Producer ----- -> [ Queue ] -- Consumer
             ..
             
               -> [ Queue ] -- Consumer

Advertisement

Answer

Specifically in the case of one producer -> many consumers, the benefit is to only have 1 Queue that the producer has to connect to and you can spin up as many consumers as you want to process ‘the next item’. Because Python has a very complicated relationship with Threading, I would recommend to use asyncio with asyncio.Queue. It is very intuitive and easy to use.

I recently brushed up on this topic and I found this gist very helpful in understanding how it works.

In any case, having more Queues will probably not speed up your processing. This could only be the case if (time to process message) < (get message from queue), which is not the case in your use case (with IO tasks etc).

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement