
How to scroll to the end of a page with a finite number of loads? Selenium – Python

I would like to scroll to the end of a page such as: https://fr.finance.yahoo.com/quote/GM/history?period1=1290038400&period2=1612742400&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true

The problem is that using this:

# Get scroll height after the first page load
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait for the page to load
    time.sleep(2)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

does not work. It should work for pages with infinite loading, but it fails on Yahoo Finance, which has a finite number of loads, even though the condition should break the loop once the end is reached. So I'm quite confused at the moment.

We could also use:

while driver.find_element_by_tag_name('tfoot'):
    # Scroll down three times to load the table
    for i in range(0, 3):
        driver.execute_script("window.scrollBy(0, 5000)")
        time.sleep(2)

but it sometimes gets stuck at a certain load.

What would be the best way to do this?


Answer

This requires pip install undetected-chromedriver, but it will get the job done. That is just my webdriver of choice; you can do exactly the same with regular Selenium.

from time import sleep as s
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

import undetected_chromedriver as uc
options = uc.ChromeOptions()
options.headless = False
driver = uc.Chrome(options=options)

driver.get('https://fr.finance.yahoo.com/quote/GM/history?period1=1290038400&period2=1612742400&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true')

# Dismiss the cookie consent dialog
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '#consent-page > div > div > div > div.wizard-body > div.actions.couple > form > button'))).click()

# Press the DOWN key until the scroll position stops changing
last_scroll_pos = None
while True:
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, 'body'))).send_keys(Keys.DOWN)
    s(.01)
    # Compare numeric scroll offsets directly (no str() conversion needed)
    current_scroll_pos = driver.execute_script('return window.pageYOffset;')
    if current_scroll_pos == last_scroll_pos:
        print('scrolling is finished')
        break
    last_scroll_pos = current_scroll_pos
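
If the page is slow to load the next batch of rows, a single unchanged reading of the scroll position can end the loop too early. A minimal sketch of one way to make the stop condition more tolerant: only stop after the position has stayed the same for several consecutive checks. The helper name scroll_until_stable and its parameters (pause, patience, max_steps) are my own, hypothetical choices, not part of Selenium. The helper takes plain callables so it can be tried without a browser.

```python
import time


def scroll_until_stable(scroll_step, get_position, pause=0.5, patience=3, max_steps=10000):
    """Call scroll_step() until get_position() is unchanged `patience` times in a row.

    Hypothetical helper, not a Selenium API. Returns the last observed position.
    """
    last = get_position()
    unchanged = 0
    steps = 0
    while unchanged < patience and steps < max_steps:
        scroll_step()
        time.sleep(pause)  # give the page time to load more content
        current = get_position()
        if current == last:
            unchanged += 1  # no movement; may be the end or a slow load
        else:
            unchanged = 0   # the page moved, so reset the counter
            last = current
        steps += 1
    return last


# Possible wiring with a Selenium driver (assumes `driver` already exists):
# scroll_until_stable(
#     lambda: driver.execute_script("window.scrollBy(0, 5000)"),
#     lambda: driver.execute_script("return window.pageYOffset;"),
# )
```

A larger pause or patience makes the loop slower but less likely to stop while rows are still loading; the right values depend on the page and the connection.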
User contributions licensed under: CC BY-SA