
Struggling with Selenium as a new backend developer

I’m very new to web scraping and am trying to build an algorithm to pull all of the information from my school’s course catalog. What I have so far is:

import requests 
from bs4 import BeautifulSoup
import selenium
from selenium.webdriver.support.ui import Select
from selenium import webdriver
import os 
import time
from selenium.common import exceptions  


driver = webdriver.Chrome()
url = "https://webapps.lsa.umich.edu/CrsMaint/Public/CB_PublicBulletin.aspx?crselevel=ug"
driver.get(url)
time.sleep(1)

driver.find_element_by_xpath('//*[@id="ContentPlaceHolder1_ddlPage"]/option[4]').click()

driver.find_element_by_xpath('//*[@name="ctl00$ContentPlaceHolder1$ddlTerm"]/option[1]').click() 

driver.find_element_by_xpath('//*[@name="ctl00$ContentPlaceHolder1$ddlSubject"]/option[8]').click() 

driver.find_element_by_xpath('//*[@id="ContentPlaceHolder1_btnSearch"]').click()

I’ve had much more, but I keep running into Selenium errors saying an element can’t be located even though the selector looks correct. Can anyone get me on the right track? I’m trying to pull all of the information!

Cheers


Answer

I’ve played around with your code and used it as a base for something a bit more like what you’d expect.
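
One version note before the code: the find_element_by_xpath helpers used here were removed in Selenium 4, so if you’re on a 4.x install the same lookups are written with a By locator instead. A minimal sketch of the equivalent call:

from selenium.webdriver.common.by import By

# Same search-button click as in the code below, written against the Selenium 4 API.
driver.find_element(By.XPATH, '//*[@id="ContentPlaceHolder1_btnSearch"]').click()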

Try this:

import textwrap

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

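# Configure Chrome; flip headless to True to scrape without a visible browser window.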
options = Options()
options.headless = False
driver = webdriver.Chrome(options=options)

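# Open the public course bulletin search form.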
driver.get("https://webapps.lsa.umich.edu/CrsMaint/Public/CB_PublicBulletin.aspx")

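# Pick the page, term, and subject from the three dropdowns, then submit the search.
# The option indexes mirror the form's dropdown order; change them for other terms or subjects.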
driver.find_element_by_xpath('//*[@id="ContentPlaceHolder1_ddlPage"]/option[4]').click()
driver.find_element_by_xpath('//*[@name="ctl00$ContentPlaceHolder1$ddlTerm"]/option[1]').click()
driver.find_element_by_xpath('//*[@name="ctl00$ContentPlaceHolder1$ddlSubject"]/option[8]').click()
driver.find_element_by_xpath('//*[@id="ContentPlaceHolder1_btnSearch"]').click()

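# Feed the rendered page to BeautifulSoup and grab every <td valign="top"> cell;
# the loop below treats each one as a single course entry.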
tables = BeautifulSoup(driver.page_source, "html.parser").find_all("td", {"valign": "top"})


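# Wrap long text so each printed line stays within `width` characters.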
def wrapper(text: str, width: int = 120) -> str:
    return "\n".join(textwrap.wrap(text, width=width)) + "\n"


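# Each course cell keeps its title in <b>, credit/term info in <i>, and the
# description in <p>; cells without that shape raise AttributeError and are skipped.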
for table in tables:
    try:
        title = table.find("b").getText(strip=True)
        course_info = " ".join(table.find("i").text.split())
        desc = table.find("p").getText(strip=True)
        urls = [f"{a.text.strip()} - {a['href']}" for a in table.find_all("a")]
        print(title)
        print(wrapper(course_info))
        print(wrapper(desc))
        print("\n".join(urls) if urls else "No URLs found.")
        print("-" * 120)
    except AttributeError:
        continue

driver.close()

In my terminal, the output (just a small part of it, as there’s a lot) looks like this:

[Screenshot: a sample of the terminal output, with each course’s title, info line, wrapped description, and links separated by a row of dashes.]
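
If you still see “unable to locate element” errors on slower loads, the lookup is probably running before the ASP.NET postback finishes rendering; an explicit wait is more reliable than time.sleep. A minimal sketch using Selenium’s WebDriverWait (the button ID is taken from the code above):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Block for up to 10 seconds until the search button is present, then click it;
# raises TimeoutException if it never appears.
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "ContentPlaceHolder1_btnSearch"))
).click()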

User contributions licensed under: CC BY-SA