Could this selenium code be recreated using scrapy?

Question

I'm interested in getting a better idea of what Scrapy can do. Here is a very simple Selenium script that interacts with a website, fills in some boxes, clicks some elements and downloads a file. Could this code be replicated using Scrapy, so that a script is written using Scrapy that does the exact same t…

Accepted Answer

Yes, this Selenium code can be recreated in Scrapy by using SeleniumRequest from the scrapy-selenium package, which runs much faster than plain Selenium. You need a Scrapy project. The browser runs in headless mode, and the script saves a screenshot at each step so you can see what it did.

Script:

import scrapy
from scrapy_selenium import SeleniumRequest
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


class TestSpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        # Load the start page through the Selenium middleware
        yield SeleniumRequest(
            url='https://www.ons.gov.uk',
            callback=self.parse,
            wait_time=3,
            screenshot=True
        )

    def parse(self, response):
        # The middleware exposes the live browser on the response
        driver = response.meta['driver']
        driver.save_screenshot('screenshot.png')

        # Type the query into the search box once it is clickable
        WebDriverWait(driver, 20).until(
            EC.element_to_be_clickable((By.NAME, "q"))
        ).send_keys("Education and childcare")
        driver.save_screenshot('screenshot_1.png')

        # Submit the search
        driver.find_element(By.XPATH, '//*[@id="nav-search-submit"]').click()
        driver.save_screenshot('screenshot_2.png')

        # Click through the results to the download link
        driver.find_element(By.XPATH, '//*[@id="results"]/div[1]/div[2]/div[1]/h3/a/span').click()
        driver.find_element(By.XPATH, '//*[@id="main"]/div[2]/div[1]/section/div/div[1]/div/div[2]/h3/a/span').click()
        driver.find_element(By.XPATH, '//*[@id="main"]/div[2]/div/div[1]/div[2]/p[2]/a').click()

settings.py file:

You have to add the following options to the settings.py file:

# Middleware
DOWNLOADER_MIDDLEWARES = {
    'scrapy_selenium.SeleniumMiddleware': 800
}

# Selenium
from shutil import which
SELENIUM_DRIVER_NAME = 'chrome'
SELENIUM_DRIVER_EXECUTABLE_PATH = which('chromedriver')
SELENIUM_DRIVER_ARGUMENTS = ['--headless']

SeleniumRequest output:

'downloader/response_status_count/200'

[Screenshot of the project]
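One detail of the settings is easy to miss: which('chromedriver') returns None when chromedriver is not on your PATH, and the middleware then fails later with a less obvious error. The following is a minimal sketch of a check you could put in settings.py to fail early instead. The helper name resolve_driver_path is hypothetical (it is not part of Scrapy or scrapy-selenium), and it assumes you want a hard error rather than a fallback.

```python
from shutil import which


def resolve_driver_path(executable="chromedriver"):
    """Return the full path to the driver executable, or raise if it is missing.

    Hypothetical helper for settings.py; not part of Scrapy or scrapy-selenium.
    """
    path = which(executable)  # None when the executable is not on PATH
    if path is None:
        raise RuntimeError(
            f"{executable!r} was not found on PATH; install it or set "
            "SELENIUM_DRIVER_EXECUTABLE_PATH to an absolute path."
        )
    return path


# In settings.py you would then write, for example:
# SELENIUM_DRIVER_EXECUTABLE_PATH = resolve_driver_path()
```

With this in place, a missing driver shows up as soon as Scrapy loads the settings, before any request is made.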


Answer