I want to scrape data from a website concurrently, but I found that the following program is NOT executed concurrently.
import asyncio

import requests
from bs4 import BeautifulSoup


async def return_soup(url):
    r = requests.get(url)
    r.encoding = "utf-8"
    soup = BeautifulSoup(r.text, "html.parser")

    future = asyncio.Future()
    future.set_result(soup)
    return future


async def parseURL_async(url):
    print("Started to download {0}".format(url))
    soup = await return_soup(url)
    print("Finished downloading {0}".format(url))
    return soup


loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

t = [parseURL_async(url_1), parseURL_async(url_2)]
loop.run_until_complete(asyncio.gather(*t))
This program starts to download the second page only after the first one finishes. If my understanding is correct, the await keyword in await return_soup(url) waits for the function to complete, and while waiting it yields control back to the event loop, which should allow the loop to start the second download. Once the function finally finishes executing, the future instance inside it receives the result value.
But why does this not work concurrently? What am I missing here?
Answer
Using asyncio is different from using threads in that you cannot add it to an existing code base to make it concurrent. Specifically, code that runs in the asyncio event loop must not block: all blocking calls must be replaced with non-blocking versions that yield control to the event loop. In your case, requests.get blocks, and that defeats the concurrency asyncio is meant to provide.
To avoid this problem, you need to use an HTTP library that is written with asyncio in mind, such as aiohttp.
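As a rough illustration, here is a minimal sketch of how your snippet might look with aiohttp, assuming aiohttp is installed and url_1 and url_2 are defined as in your code:

import asyncio

import aiohttp
from bs4 import BeautifulSoup


async def return_soup(session, url):
    # session.get is non-blocking: awaiting it suspends this coroutine and
    # hands control back to the event loop so the other download can run.
    async with session.get(url) as r:
        text = await r.text(encoding="utf-8")
    return BeautifulSoup(text, "html.parser")


async def parseURL_async(session, url):
    print("Started to download {0}".format(url))
    soup = await return_soup(session, url)
    print("Finished downloading {0}".format(url))
    return soup


async def main(urls):
    # Share one session for all requests; gather schedules both coroutines
    # so the downloads overlap.
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(parseURL_async(session, u) for u in urls))


soups = asyncio.run(main([url_1, url_2]))

Because the awaits now suspend instead of blocking, both parseURL_async calls can be in flight at the same time, and the "Started"/"Finished" messages should interleave.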