Slow scrolling down the page using Selenium

Question

I'm trying to scrape some data from a flight search page. The page works like this: you fill in a form and then click the Search button, and this part is fine. When you click the button, you are redirected to the results page, and here is the problem. This page keeps adding results continuously, for example for one

Accepted Answer

Here is a different approach that worked for me. It scrolls the last search result into view and waits for additional elements to load before scrolling again:

    # -*- coding: utf-8 -*-
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.common.exceptions import StaleElementReferenceException
    from selenium.webdriver.support import expected_conditions as EC


    class wait_for_more_than_n_elements(object):
        """Custom Expected Condition: true once more than `count` elements match."""

        def __init__(self, locator, count):
            self.locator = locator
            self.count = count

        def __call__(self, driver):
            try:
                # public find_elements call instead of the private EC._find_elements helper
                count = len(driver.find_elements(*self.locator))
                # strictly greater, so the wait only ends once new results have arrived
                return count > self.count
            except StaleElementReferenceException:
                return False


    driver = webdriver.Firefox()

    dep_airport = ['BTS', 'BRU', 'PAR']
    arr_airport = 'MAD'
    dep_date = '2015-07-15'
    arr_date = '2015-08-15'

    airports_string = str(r'%20').join(dep_airport)
    dep_airport = airports_string

    url = "https://www.pelikan.sk/sk/flights/list?dfc=C%s&dtc=C%s&rfc=C%s&rtc=C%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=1&rbd=0&ct=0" % (dep_airport, arr_airport, arr_airport, dep_airport, dep_date, arr_date)
    driver.maximize_window()
    driver.get(url)

    # wait for both loading indicators to disappear
    wait = WebDriverWait(driver, 60)
    wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
    wait.until(EC.invisibility_of_element_located((By.XPATH,
                                                   u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))

    while True:  # TODO: make the endless loop end
        results = driver.find_elements(By.CSS_SELECTOR, "div.flightbox")
        print("Results count: %d" % len(results))

        # scroll to the last element
        driver.execute_script("arguments[0].scrollIntoView();", results[-1])

        # wait for more results to load
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))

Notes:

- you would need to figure out when to stop the loop, for example at a particular len(results) value
- wait_for_more_than_n_elements is a custom Expected Condition which helps to identify when the next portion is loaded and we can scroll again
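
The first note leaves the stopping condition open. One possible way to end the loop is sketched below: stop either when a chosen number of results is reached or when scrolling brings no new results within a timeout. This is only an illustration under stated assumptions, not part of the original answer. The helper name scroll_until_exhausted and its parameters max_results, timeout and poll are hypothetical, and it uses plain polling in place of WebDriverWait so that it relies only on the driver's find_elements and execute_script methods.

```python
import time

# String value of selenium's By.CSS_SELECTOR, used so this sketch needs no selenium import
CSS = "css selector"


def scroll_until_exhausted(driver, selector, max_results=None, timeout=10.0, poll=0.5):
    """Hypothetical helper: keep scrolling the last match into view.

    Returns the number of matching elements once `max_results` is reached
    or no new elements appear within `timeout` seconds after a scroll.
    """
    while True:
        results = driver.find_elements(CSS, selector)
        count = len(results)

        # stop if there is nothing to scroll to or the requested amount is loaded
        if count == 0 or (max_results is not None and count >= max_results):
            return count

        # scroll to the last element to trigger loading of the next portion
        driver.execute_script("arguments[0].scrollIntoView();", results[-1])

        # poll until the element count grows, or give up after `timeout`
        deadline = time.monotonic() + timeout
        while len(driver.find_elements(CSS, selector)) <= count:
            if time.monotonic() >= deadline:
                return count
            time.sleep(poll)
```

With the driver from the answer, a call such as scroll_until_exhausted(driver, "div.flightbox", max_results=200) would replace the while True loop. The timeout value is a guess and would need tuning to how quickly the page loads each batch.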

Answer