I am calling an external API many times and downloading each response's content. I am using aiohttp and asyncio to speed this up, but I'm having trouble figuring out how to separate the fetch functionality from the save functionality.
Setup
import asyncio
import os

from aiohttp import ClientSession
Currently, I am using the following function:
async def fetch_and_save(link, path, client):
    async with client.get(link) as response:
        contents = await response.read()

    if not os.path.exists(os.path.dirname(path)):
        os.makedirs(os.path.dirname(path))
    with open(path, "wb") as f:
        f.write(contents)
My main call looks like this:
async def fetch_and_save_all(inputs):
    async with ClientSession() as client:
        tasks = [asyncio.ensure_future(fetch_and_save(link, path, client))
                 for link, path in inputs]
        for f in asyncio.as_completed(tasks):
            await f


def main(inputs):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(fetch_and_save_all(inputs))


if __name__ == "__main__":
    inputs = [
        (f"https://httpbin.org/range/{i}", f"./tmp/{i}.txt") for i in range(1, 10)]
    main(inputs)
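(As an aside, on Python 3.7+ the manual event-loop handling in main can be replaced with asyncio.run, which creates and closes the loop for you. A minimal equivalent:

def main(inputs):
    # asyncio.run manages the event loop's lifecycle (Python 3.7+)
    asyncio.run(fetch_and_save_all(inputs))

)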
Given this basic example, is it possible to separate the fetch and save functionality in fetch_and_save?
Answer
Just create independent functions for the fetch portion and the save portion.
async def fetch(link, client):
    async with client.get(link) as response:
        contents = await response.read()
    return contents


def save(contents, path):
    if not os.path.exists(os.path.dirname(path)):
        os.makedirs(os.path.dirname(path))
    with open(path, 'wb') as f:
        bytes_written = f.write(contents)
    return bytes_written


async def fetch_and_save(link, path, client):
    contents = await fetch(link, client)
    save(contents, path)
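One benefit of the split: save is plain blocking file I/O, so once it is its own function it can be pushed off the event loop, keeping slow disk writes from stalling the other downloads. A minimal sketch, assuming Python 3.9+ for asyncio.to_thread (on older versions, loop.run_in_executor(None, save, contents, path) is the equivalent):

async def fetch_and_save(link, path, client):
    contents = await fetch(link, client)
    # Run the blocking save in a worker thread so file I/O does not
    # block other coroutines on the event loop (Python 3.9+).
    await asyncio.to_thread(save, contents, path)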