Web scraping table data with a drop-down menu: help in either Pandas, Beautiful Soup or Selenium

I am trying to scrape data from this website:

https://www.shanghairanking.com/rankings/grsssd/2021

Initially pandas gets me out of the gate and I can scrape the table, but I am struggling with the drop-down menus. I want to select the options next to the total score box, which are PUB, CIT, etc. When I inspect the element it looks like it may be JavaScript, and the usual methods of iterating over these options don't work. I have tried Beautiful Soup and, most recently, Selenium to select the drop-downs by hand. This works for the default table data, but the following attempt to select CIT doesn't get me anywhere:

'''
import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome('/Users/martinbell/Downloads/chromedriver')
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
submit = driver.find_element_by_xpath("//input[@value='CIT']").click()
'''

Answer

Your code does not work because you first have to click the dropdown open and only then traverse the options inside it. Here is the refactored code.

Note that I have used time.sleep here for demonstration purposes only; for robust code and good practice, use an explicit wait such as WebDriverWait.

# By is needed for By.XPATH; time and driver come from the code in the question.
from selenium.webdriver.common.by import By

driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
time.sleep(10)  # crude fixed wait for the page to finish rendering
# Click the dropdown next to the total score box to open it.
driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()
# The commented-out code below loops through all the dropdown options and performs actions.
# opt_ele = driver.find_elements(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li")
# for ele in opt_ele:
#     print(ele.text)
#     ele.click()
#     print('perform your actions here')
#     # Selecting an option closes the dropdown, so reopen it for the next one.
#     driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()

# If you do not want to loop through but just want to select only CIT, here is the line:
driver.find_element(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li[text()='CIT']").click()
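
As a rough sketch of the explicit-wait approach recommended in the note, the time.sleep call could be replaced with WebDriverWait and expected_conditions. The XPaths are the ones used in the answer and depend on the page's current markup. The helper names (option_xpath, click_when_ready, select_indicator, visit_indicators) and the 10-second default timeout are assumptions made for illustration, not part of the original answer.

```python
# XPaths copied from the answer; they may break if the site's markup changes.
DROPDOWN_XPATH = "(//*[@class='inputWrapper'])[3]"
OPTIONS_XPATH = "(//*[@class='rank-select'])[2]//*[@class='options']//li"


def option_xpath(label):
    """Return the XPath of the dropdown entry whose text is exactly `label`."""
    # Labels containing a single quote would need escaping; PUB, CIT, etc. do not.
    return f"{OPTIONS_XPATH}[text()='{label}']"


def click_when_ready(driver, xpath, timeout=10):
    """Wait until the element is clickable (up to `timeout` seconds), then click it."""
    # Imported lazily so option_xpath() can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.XPATH, xpath))
    ).click()


def select_indicator(driver, label, timeout=10):
    """Open the dropdown, then pick one entry such as 'CIT'."""
    click_when_ready(driver, DROPDOWN_XPATH, timeout)
    click_when_ready(driver, option_xpath(label), timeout)


def visit_indicators(driver, labels, action, timeout=10):
    """Select each label in turn and call action(driver, label) afterwards."""
    for label in labels:
        # As in the answer's loop, each selection closes the dropdown,
        # so select_indicator reopens it before every click.
        select_indicator(driver, label, timeout)
        action(driver, label)
```

After driver.get(...), calling select_indicator(driver, 'CIT') would stand in for the sleep-and-click lines, and visit_indicators(driver, ['PUB', 'CIT'], my_action) would stand in for the commented-out loop, where my_action is a hypothetical callback that reads the table. The table may refresh asynchronously after a selection, so the callback may need its own wait before reading the page.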
User contributions licensed under: CC BY-SA