
The iteration loop is not working properly for API

The API returns at most one hundred results per page. I am trying to write a while loop that goes through every page and collects the results from all of them, but it does not work properly.

This script goes through the pages:

params = dict(
    order_by='salary_desc',
    text=keyword,
    area=area,
    period=30, # days
    per_page=100,
    page = 0,
    no_magic='false',  # disable magic
    search_field='name'  # available: name, description, company_name
)
pages = []
while True:
  params["page"] += 1
  response = requests.get(BASE_URL + '/vacancies', headers={'User-Agent': generate_user_agent()}, params=params,)
  items = response.json()['items']
  if not items:
    break
  pages.append(items) # Do it for each page
response

When I run it:

params
{'area': 1,
 'no_magic': 'false',
 'order_by': 'salary_desc',
 'page': 5,
 'per_page': 100,
 'period': 30,
 'search_field': 'name',
 'text': '"python"'}

It sees five pages.

When I look at the variable after execution:

len(pages)
4

It only collects four pages.

If I understand correctly, it skips page zero (pages in the API start at zero).

How can I fix this?

The complete script is in Colab at this link: https://colab.research.google.com/drive/14KddVLTyH3LkcE-LmHm7EooTYMM7b0zB?usp=sharing


Answer

You are incrementing the page before making the request, so page 0 is never fetched. Move the increment to the end of the loop body, like so:

while True:
  response = requests.get(BASE_URL + '/vacancies', headers={'User-Agent': generate_user_agent()}, params=params,)
  items = response.json()['items']
  if not items:
    break
  pages.append(items) # Do it for each page
  params["page"] += 1
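
To check the ordering without calling the real API, here is a minimal sketch of the same loop run against a fake data source. The names fetch_all_pages, fake_fetch and FAKE_API are hypothetical and only stand in for the requests.get call above. It assumes, as in the question, that a page past the end comes back with an empty list of items.

```python
from typing import Callable, List


def fetch_all_pages(fetch_page: Callable[[int], List[dict]],
                    start_page: int = 0) -> List[List[dict]]:
    """Collect pages until an empty one is returned, starting at start_page."""
    pages = []
    page = start_page
    while True:
        items = fetch_page(page)  # request the current page first
        if not items:
            break  # an empty page means there is nothing left
        pages.append(items)
        page += 1  # advance only after the current page has been stored
    return pages


# Hypothetical stand-in for the API: five pages numbered 0..4.
FAKE_API = {n: [{"id": n * 100 + i} for i in range(3)] for n in range(5)}


def fake_fetch(page: int) -> List[dict]:
    # Unknown page numbers return an empty list, like an exhausted listing.
    return FAKE_API.get(page, [])


all_pages = fetch_all_pages(fake_fetch)
print(len(all_pages))  # 5
```

Starting the same loop at page 1 (which is what incrementing before the request amounts to) collects only four pages, matching the symptom in the question. With the real API, fetch_page would wrap the requests.get call and return response.json()['items'].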