I would like to scroll to the end of a page such as: https://fr.finance.yahoo.com/quote/GM/history?period1=1290038400&period2=1612742400&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true
The problem is that this approach:
    import time

    # Get the scroll height after the first page load
    last_height = driver.execute_script("return document.body.scrollHeight")
    while True:
        # Scroll down to the bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Wait for the page to load
        time.sleep(2)
        # Calculate the new scroll height and compare it with the last one
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height
does not work. It should work for pages with infinite loading, but it fails on Yahoo Finance, which has a finite number of loads, even though the condition should break when it reaches the end. So I'm quite confused at the moment.
We could also use:
    while driver.find_element_by_tag_name('tfoot'):
        # Scroll down three times to load the table
        for i in range(0, 3):
            driver.execute_script("window.scrollBy(0, 5000)")
            time.sleep(2)
but it sometimes gets stuck at a certain load.
What would be the best way to do this?
Answer
This requires pip install undetected-chromedriver, but it will get the job done.
It's just my webdriver of choice; you can do exactly the same with regular Selenium.
    from time import sleep

    import undetected_chromedriver as uc
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = uc.ChromeOptions()
    options.headless = False
    driver = uc.Chrome(options=options)

    driver.get('https://fr.finance.yahoo.com/quote/GM/history?period1=1290038400&period2=1612742400&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true')

    # Dismiss the cookie consent dialog
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '#consent-page > div > div > div > div.wizard-body > div.actions.couple > form > button'))).click()

    # None never equals a scroll offset, so the first comparison cannot end the loop
    last_scroll_pos = None
    while True:
        # Press the down arrow on the page body to scroll a little further
        WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, 'body'))).send_keys(Keys.DOWN)
        sleep(.01)
        current_scroll_pos = driver.execute_script('return window.pageYOffset;')
        # If the offset did not change since the last key press, we are at the end
        if current_scroll_pos == last_scroll_pos:
            print('scrolling is finished')
            break
        last_scroll_pos = current_scroll_pos
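Since the same idea works with regular Selenium, here is a rough sketch of the loop above pulled into a reusable helper. It is not tested against Yahoo Finance, so treat it as a starting point. The function name scroll_to_end and its parameters (step, pause, max_idle, max_steps) are my own hypothetical names, not part of Selenium. The only Selenium call it relies on is driver.execute_script, so it should accept either a uc.Chrome or a plain webdriver.Chrome instance. Unlike the loop above, it only stops after the offset has stayed the same for several checks in a row, which may help when a batch of rows is slow to load.

```python
import time


def scroll_to_end(driver, step=2000, pause=0.5, max_idle=3, max_steps=1000):
    """Scroll down until the vertical offset stops changing.

    `driver` is any object exposing a Selenium-style `execute_script` method.
    All parameter names are hypothetical choices made for this sketch.
    Returns the number of scroll steps performed.
    """
    last_pos = None
    idle = 0   # consecutive checks with an unchanged offset
    steps = 0
    while idle < max_idle and steps < max_steps:
        # Scroll down by `step` pixels, then give the page time to load more rows
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        time.sleep(pause)
        pos = driver.execute_script("return window.pageYOffset;")
        # Count an idle check only when the offset did not move at all
        idle = idle + 1 if pos == last_pos else 0
        last_pos = pos
        steps += 1
    return steps
```

With regular Selenium you would create the driver with webdriver.Chrome(), call driver.get(...) with the URL, dismiss the cookie dialog as above, and then call scroll_to_end(driver). The max_steps cap is only there so the loop cannot run forever on a page that really does load endlessly.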