I am calling an external API many times and downloading each response's content. I am using aiohttp and asyncio to speed this up, but I'm having trouble figuring out how to separate the fetch functionality from the save functionality.
Setup
import asyncio
import os

from aiohttp import ClientSession
Currently, I am using the following function:
async def fetch_and_save(link, path, client):
    async with client.get(link) as response:
        contents = await response.read()

    if not os.path.exists(os.path.dirname(path)):
        os.makedirs(os.path.dirname(path))
    with open(path, "wb") as f:
        f.write(contents)
My main call looks like this:
async def fetch_and_save_all(inputs):
    async with ClientSession() as client:
        tasks = [asyncio.ensure_future(fetch_and_save(link, path, client))
                 for link, path in inputs]
        for f in asyncio.as_completed(tasks):
            await f


def main(inputs):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(fetch_and_save_all(inputs))


if __name__ == "__main__":
    inputs = [
        (f"https://httpbin.org/range/{i}", f"./tmp/{i}.txt") for i in range(1, 10)]
    main(inputs)
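(As an aside, on Python 3.7+ the manual event-loop handling in main can be replaced with asyncio.run, which creates and closes the loop for you. A minimal equivalent:

def main(inputs):
    # asyncio.run manages the event loop's lifecycle (Python 3.7+)
    asyncio.run(fetch_and_save_all(inputs))

)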
Given this basic example, is it possible to separate the fetch and save functionality in fetch_and_save?
Answer
Just create independent functions for the fetch portion and the save portion.
async def fetch(link, client):
    async with client.get(link) as response:
        contents = await response.read()
    return contents


def save(contents, path):
    if not os.path.exists(os.path.dirname(path)):
        os.makedirs(os.path.dirname(path))
    with open(path, 'wb') as f:
        bytes_written = f.write(contents)
    return bytes_written


async def fetch_and_save(link, path, client):
    contents = await fetch(link, client)
    save(contents, path)
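One benefit of the split: save is plain blocking file I/O, so once it is its own function it can be pushed off the event loop, keeping slow disk writes from stalling the other downloads. A minimal sketch, assuming Python 3.9+ for asyncio.to_thread (on older versions, loop.run_in_executor(None, save, contents, path) is the equivalent):

async def fetch_and_save(link, path, client):
    contents = await fetch(link, client)
    # Run the blocking save in a worker thread so file I/O does not
    # block other coroutines on the event loop (Python 3.9+).
    await asyncio.to_thread(save, contents, path)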