
asyncio python coroutine cancelled while task still pending reading from redis channel

I have multiple coroutines, each of which waits for content in a queue before it starts processing.

The queues are populated by channel subscribers whose only job is to receive messages and push an item onto the appropriate queue.

After the data is consumed by one queue processor and new data is generated, it is dispatched to the appropriate message channel, where the process repeats until the data is ready to be relayed to an API that provisions it.

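A simplified sketch of this setup (using the aioredis 1.x pub/sub API together with asyncio.Queue; the channel and queue names are illustrative, and in reality the subscribers run in separate processes):

```python
import asyncio
import aioredis


async def subscriber(queue: asyncio.Queue):
    # Channel subscriber: receives messages from a Redis channel and pushes them onto a queue.
    redis = await aioredis.create_redis_pool("redis://localhost")
    try:
        (channel,) = await redis.subscribe("stage-1")          # illustrative channel name
        while await channel.wait_message():
            msg = await channel.get(encoding="utf-8")
            await queue.put(msg)
    finally:
        redis.close()
        await redis.wait_closed()


async def queue_processor(queue: asyncio.Queue):
    # Queue processor: waits for items on the queue, processes them,
    # then dispatches the result to the next message channel.
    redis = await aioredis.create_redis_pool("redis://localhost")
    try:
        while True:
            item = await queue.get()
            result = item.upper()                              # placeholder for the real processing
            await redis.publish("stage-2", result)             # illustrative downstream channel
    finally:
        redis.close()
        await redis.wait_closed()
```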

What I’m noticing is that after a 24h run I’m getting cancellation errors that I’m not sure how to interpret, resolve, or recover from. My assumption is that, as a first step, I should probably switch to Redis streams instead of using channels and queues.

But, going back to this scenario: the channel subscribers run in different processes, while the consumers run in the same process as different tasks on the event loop.

What I’m assuming is happening is that, since the consumer is basically polling a queue, at some point the connection pool manager or Redis itself hangs up on the consumer's open connection and the consumer gets cancelled.

That would explain why I'm not seeing any further messages from that queue processor. But I also see a wait_for_future, which I suspect may come from the subscriber's ensure_future on the message reader.

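For reference, the fire-and-forget pattern I mean looks roughly like this (a sketch with illustrative names):

```python
import asyncio


async def message_reader(channel, queue: asyncio.Queue):
    # Reads messages off an aioredis pub/sub channel and forwards them to the queue.
    while await channel.wait_message():
        await queue.put(await channel.get(encoding="utf-8"))


def start_reader(channel, queue: asyncio.Queue) -> asyncio.Task:
    # Fire-and-forget: if this task is ever cancelled (e.g. because its connection
    # is dropped), the CancelledError surfaces from whatever it was awaiting.
    return asyncio.ensure_future(message_reader(channel, queue))
```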

I could use some help sorting this out and properly understanding what's happening under the hood. Thanks.


Answer

It took a few days just to reproduce the issue; I found people with the same problem in the issues of the aioredis GitHub repo.

So I had to go through every place where a connection to Redis is opened or closed, and made sure each connection is explicitly closed when it is no longer needed.

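Roughly, every connection now goes through something like this (a sketch against the aioredis 1.3.1 API; the helper is mine, not part of the library):

```python
from contextlib import asynccontextmanager

import aioredis


@asynccontextmanager
async def redis_connection(url: str = "redis://localhost"):
    # Guarantees close() + wait_closed() on every code path, including cancellation.
    redis = await aioredis.create_redis_pool(url)
    try:
        yield redis
    finally:
        redis.close()
        await redis.wait_closed()
```

Usage is then `async with redis_connection() as redis: ...`, so a forgotten close can no longer leak a connection.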

I also improved the exception handling in the consumer.

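Something along these lines (a sketch, not the exact code; process() and the channel names are placeholders):

```python
import asyncio

import aioredis


def process(item: str) -> str:
    # Placeholder for the real processing step.
    return item.upper()


async def queue_processor(queue: asyncio.Queue):
    redis = await aioredis.create_redis_pool("redis://localhost")
    try:
        while True:
            item = await queue.get()
            try:
                await redis.publish("stage-2", process(item))
            except aioredis.RedisError:
                # The connection died: drop it cleanly and reconnect instead of
                # letting the exception silently kill the coroutine.
                redis.close()
                await redis.wait_closed()
                redis = await aioredis.create_redis_pool("redis://localhost")
    except asyncio.CancelledError:
        # Let cancellation propagate, but only after the cleanup below has run.
        raise
    finally:
        redis.close()
        await redis.wait_closed()
```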

And I did the same for the producer.

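For the producer side (the channel subscriber that feeds the queue), the shape is the same (again a sketch with illustrative names):

```python
import asyncio

import aioredis


async def channel_subscriber(queue: asyncio.Queue):
    redis = await aioredis.create_redis_pool("redis://localhost")
    try:
        (channel,) = await redis.subscribe("stage-1")           # illustrative channel name
        while await channel.wait_message():
            await queue.put(await channel.get(encoding="utf-8"))
    except asyncio.CancelledError:
        raise
    finally:
        redis.close()
        await redis.wait_closed()
```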

This way, no connections are ever left open with Redis that might end up killing one of the processor's coroutines as time goes by.

For me this solved the issue: at the time I'm writing this, it has been more than two weeks of uptime with no further incidents of the same kind.

Anyway, note that there is also a new aioredis major release (this was on 1.3.1, and 2.0.0 works with the same model as redis-py), so things have changed as well by this time.
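If you migrate to 2.x, the equivalent pub/sub code looks roughly like this (a sketch of the redis-py-style API; the channel name is illustrative):

```python
import asyncio

import aioredis  # 2.x


async def main():
    redis = aioredis.from_url("redis://localhost")
    pubsub = redis.pubsub()
    await pubsub.subscribe("stage-1")                # illustrative channel name
    async for message in pubsub.listen():
        if message["type"] == "message":
            print(message["data"])


asyncio.run(main())
```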

User contributions licensed under: CC BY-SA