Following links and crawling them

Question

I was trying to make a crawler to follow links, with this code I was able to get the links but the part of entering the links and getting the information I need was not working, so a friend helped me to come up with this code It gets the json with the page items, but in loop number 230

Accepted Answer

I after many sleepless nights solved my problem, I will leave it here in case it helps someone.import timeimport requestsimport pandas as pdfrom bs4 import BeautifulSoupfrom selenium import webdriverfrom selenium.webdriver.support.ui import WebDriverWaitfrom selenium.webdriver.support import expected_conditions as ECfrom selenium.webdriver.common.by import Byfrom selenium.webdriver.chrome.options import Optionsimport jsonclass DicionarioSpider(scrapy.Spider):    name = 'dicionario'    allowed_domains = ['www.mediktor.com']    start_urls = ['http://www.mediktor.com/']    def parse(self, response):        url = "https://www.mediktor.com/pt-br/glossario"        option = Options()        option.headless = True        driver = webdriver.Chrome(options=option)        driver.get(url)        time.sleep(10)        el_links = driver.find_elements(            By.XPATH, "//a[@class='mdk-dictionary-list__glossary-item']")        urls = []        nome_doenca = []        for i in range(len(el_links)):            urls.append(el_links[i].get_attribute('href'))        for link in urls:            driver.get(link)            myElem = WebDriverWait(driver, 5).until(                EC.presence_of_element_located((By.XPATH,                                                "//div[@class='mdk-conclusion-detail__main-title']"                                                )))            nome_source = driver.find_element(By.XPATH,                                              "//div[@class='mdk-conclusion-detail__main-title']"                                              ).text            nome_doenca.append(nome_source)            driver.back()        print(nome_doenca)        driver.quit()I just modified my code and didn&#8217;t use scrapy, just the selenium selectors.

Advertisement

Answer