I am trying to scrape a site using Python, Selenium, and BeautifulSoup. When I try to get all the links, it returns an invalid string. This is what I have tried. Can someone help me, please?
from time import sleep
from selenium.webdriver.common.by import By
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.hirist.com/c/filter/mobile-applications-jobs-in-cochin%20kochi_trivandrum%20thiruvananthapuram-5-70_75-0-0-1-0-0-0-0-2.html?ref=homepagecat')
sleep(10)

links = driver.find_elements(by=By.XPATH, value='.//div[@class="jobfeed-wrapper multiple-wrapper"]')
for link in links:
    link.get_attribute('href')
    print(link)
Answer
The problem is your XPath selection: it selects the <div>, which does not have an href attribute. Select its child <a> instead, for example with .//div[@class="jobfeed-wrapper multiple-wrapper"]/a, and print the value returned by get_attribute('href') rather than the element itself:
links = driver.find_elements(by=By.XPATH, value='.//div[@class="jobfeed-wrapper multiple-wrapper"]/a')
for link in links:
    print(link.get_attribute('href'))
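The difference between the two selectors can be checked without launching a browser. The following is only a minimal sketch using the standard library's xml.etree.ElementTree, which understands this subset of XPath. The SAMPLE markup and its /j/job-1.html style links are made up for illustration and are not the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for the job list; the real page may differ.
SAMPLE = """
<html><body>
  <div class="jobfeed-wrapper multiple-wrapper"><a href="/j/job-1.html">Job 1</a></div>
  <div class="jobfeed-wrapper multiple-wrapper"><a href="/j/job-2.html">Job 2</a></div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)

DIV_XPATH = './/div[@class="jobfeed-wrapper multiple-wrapper"]'

# Selecting the <div> itself: it carries no href attribute, so each lookup is None.
div_hrefs = [div.get("href") for div in root.findall(DIV_XPATH)]

# Selecting the child <a>: this is where the href actually lives.
a_hrefs = [a.get("href") for a in root.findall(DIV_XPATH + "/a")]

print(div_hrefs)  # [None, None]
print(a_hrefs)    # ['/j/job-1.html', '/j/job-2.html']
```

Selenium behaves the same way in this respect: get_attribute('href') returns None for an element that has no such attribute.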
Example
Instead of a fixed sleep from time, use WebDriverWait to wait until specific elements are available.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

url = 'https://www.hirist.com/c/filter/mobile-applications-jobs-in-cochin%20kochi_trivandrum%20thiruvananthapuram-5-70_75-0-0-1-0-0-0-0-2.html?ref=homepagecat'

# Selenium 4 expects the driver path to be wrapped in a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get(url)

# Wait up to 10 seconds for the job links to be present in the DOM
wait = WebDriverWait(driver, 10)
links = wait.until(EC.presence_of_all_elements_located((By.XPATH, './/div[@class="jobfeed-wrapper multiple-wrapper"]/a')))

for link in links:
    print(link.get_attribute('href'))
Output
https://www.hirist.com/j/xforia-technologies-android-developer-javakotlin-10-15-yrs-1011605.html?ref=cl&jobpos=1&jobversion=2
https://www.hirist.com/j/firminiq-system-ios-developer-swiftobjective-c-3-10-yrs-1011762.html?ref=cl&jobpos=2&jobversion=2
https://www.hirist.com/j/firminiq-system-android-developer-kotlin-3-10-yrs-1011761.html?ref=cl&jobpos=3&jobversion=2
https://www.hirist.com/j/react-native-developer-mobile-app-designing-3-5-yrs-1009438.html?ref=cl&jobpos=4&jobversion=2
https://www.hirist.com/j/flutter-developer-iosandroid-apps-2-3-yrs-1008214.html?ref=cl&jobpos=5&jobversion=2
https://www.hirist.com/j/accubits-technologies-react-native-developer-ios-android-platforms-3-7-yrs-1003520.html?ref=cl&jobpos=6&jobversion=2
https://www.hirist.com/j/appincubator-react-native-developer-iosandroid-platform-2-7-yrs-1001957.html?ref=cl&jobpos=7&jobversion=2
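If you want the links in a list instead of printed output, one possible approach is a small helper that skips elements without an href and drops duplicates. This is only a sketch: collect_hrefs and FakeElement are hypothetical names, and FakeElement is just a stand-in for a Selenium WebElement so the snippet runs without a browser.

```python
def collect_hrefs(elements):
    """Return unique, non-empty href values in first-seen order."""
    seen = set()
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")  # None when the attribute is missing
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, href=None):
        self._href = href

    def get_attribute(self, name):
        return self._href if name == "href" else None


sample = [
    FakeElement("https://example.com/a"),
    FakeElement(),                          # no href, gets skipped
    FakeElement("https://example.com/a"),   # duplicate, gets skipped
    FakeElement("https://example.com/b"),
]
result = collect_hrefs(sample)
print(result)  # ['https://example.com/a', 'https://example.com/b']
```

With a real driver, you would pass the links list returned by wait.until(...) to collect_hrefs.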