Scrape the next page by changing the page number in the URL

I'm having trouble scraping information from the next pages. I also run into a problem when some tags change, for example when the website developer replaces an “a href” with an “h2 class”, by the time I reach appart_response = requests.get(link).

Could you please check the following code with me:



Answer

There is a second <a> with the text En savoir plus.

Sometimes it has href="#", but usually when that happens an <h2><a> exists and has the correct href. So you can search for both links and use the correct one.

I use 'div', {'class': 'contentBox'} instead of 'h2', {'class': 'listingTit'}. Then find('a') gives me the first <a> (if it exists) or the second <a>, and I get the correct href.

To be safe, I use if/else to skip a room when I don't have an href.
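As a rough sketch of the idea above (not the original code from this answer), the link selection could look like this. The sample HTML, the function name `extract_room_links`, and the base URL are assumptions made up for illustration; only the `contentBox` and `listingTit` class names come from the page discussed here.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup that mimics the structure described above.
SAMPLE_HTML = """
<div class="contentBox">
  <h2 class="listingTit"><a href="/room/1">Room 1</a></h2>
  <a href="#">En savoir plus</a>
</div>
<div class="contentBox">
  <a href="/room/2">En savoir plus</a>
</div>
<div class="contentBox">
  <a href="#">En savoir plus</a>
</div>
<div class="contentBox"><h2 class="listingTit">Room without link</h2></div>
"""


def extract_room_links(html, base_url):
    """Return one absolute URL per contentBox that has a usable href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for box in soup.find_all("div", {"class": "contentBox"}):
        # Look at every <a> in the box and keep the first real link,
        # ignoring placeholders such as href="#".
        href = None
        for anchor in box.find_all("a", href=True):
            if anchor["href"] != "#":
                href = anchor["href"]
                break
        if href:
            links.append(urljoin(base_url, href))
        else:
            # No usable href in this box: skip the room.
            continue
    return links


if __name__ == "__main__":
    for link in extract_room_links(SAMPLE_HTML, "https://example.com/"):
        print(link)
```

Each returned link can then be passed to requests.get(link) as in the question.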


The page has two 'a', {'class': 'arrowDot'} elements (left arrow, right arrow). Sometimes the left arrow is hidden, but you still need to get the second arrow to get the correct URL of the next page.
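A minimal sketch of picking the second arrow might look like the following. The function name `find_next_page_url`, the sample HTML, and the `?page=2` URL pattern are hypothetical. Treating "fewer than two arrows" or "no real href" as "no next page" is my assumption too, since I did not check how the last page behaves.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical pagination markup: the left arrow is hidden but still present.
SAMPLE_PAGINATION = """
<a class="arrowDot" href="#" style="display:none">previous</a>
<a class="arrowDot" href="/list?page=2">next</a>
"""


def find_next_page_url(html, base_url):
    """Return the absolute URL of the next page, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    arrows = soup.find_all("a", {"class": "arrowDot"})
    # The right arrow is always the second element, even when the left
    # arrow is hidden, so index 1 is used instead of index 0.
    if len(arrows) < 2:
        return None  # assumption: no second arrow means no next page
    href = arrows[1].get("href")
    if not href or href == "#":
        return None  # assumption: placeholder href means no next page
    return urljoin(base_url, href)


if __name__ == "__main__":
    print(find_next_page_url(SAMPLE_PAGINATION, "https://example.com/list"))
```

In a scraping loop you would fetch a page, process its rooms, call this function on the same HTML, and stop when it returns None.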


Full working code:


Result (for first page):

User contributions licensed under: CC BY-SA