I am new to scraping with Selenium and am stumped on how to extract JSON that is conveniently available directly within a div. The div even has an attribute named data-json.
The basic code I have so far is:
    # this works perfectly
    from selenium import webdriver
    DRIVER_PATH = '/my/path/to/driver'
    driver = webdriver.Chrome(executable_path=DRIVER_PATH)
    driver.get('https://www.genecards.org/cgi-bin/carddisp.pl?gene=IL1R1')

    # this errors out
    element = driver.find_element_by_data_type('Biological')

    # this also errors out
    driver.execute_script('return JSON.stringify(window.dataJSON)')
The JSON that I am looking for is in a div within a div, where the outermost div has the attribute data-type='Biological'
and the innermost div contains the JSON data itself:
Note that this webpage contains a few other embedded JSONs so the data-type
in the parent div is important to specify. Any guidance on how to extract this JSON as a string variable would be appreciated!
Answer
I think you could try this, which reads the data-json attribute and parses it with json.loads:
    import json
    from selenium import webdriver

    DRIVER_PATH = '/my/path/to/driver'
    driver = webdriver.Chrome(executable_path=DRIVER_PATH)
    driver.get('https://www.genecards.org/cgi-bin/carddisp.pl?gene=IL1R1')

    # Read the raw JSON string from the div's data-json attribute.
    # Newer Selenium 4 releases drop the find_element_by_* helpers;
    # use driver.find_element(By.CSS_SELECTOR, ...) there.
    element = driver.find_element_by_css_selector('#enhancerControllerComponent > div').get_attribute("data-json")

    # goTermController
    # element = driver.find_element_by_css_selector('#go_func > div.gc-subsection-inner-wrap > div > div').get_attribute("data-json")

    # Parse the string into Python objects.
    j = json.loads(str(element))
    print(j)
    print(type(j))
If you want to fetch a different element, follow these steps:
1. Find the element in the browser's Developer Tools.
2. Right-click the element.
3. Choose "Copy" -> "Copy selector".
4. Replace the selector passed to find_element_by_css_selector with the copied one, as in the code above.
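As an alternative to copying a selector by hand, the data-type attribute mentioned in the question can be targeted with a CSS attribute selector. The following is only a sketch: the helper names `build_selector` and `extract_data_json` are hypothetical, the selector assumes the structure described in the question (an outer div with data-type and a nested div carrying data-json) and has not been checked against the live page, and the string "css selector" is the value of `By.CSS_SELECTOR` in Selenium 4.

```python
import json


def build_selector(data_type):
    # Hypothetical helper: match a div with a data-json attribute that is
    # nested anywhere inside a div whose data-type equals the given value.
    return f"div[data-type='{data_type}'] div[data-json]"


def extract_data_json(driver, data_type):
    # "css selector" is the string behind selenium's By.CSS_SELECTOR,
    # so this call has the Selenium 4 find_element(by, value) shape.
    element = driver.find_element("css selector", build_selector(data_type))
    raw = element.get_attribute("data-json")
    if raw is None:
        # get_attribute returns None when the attribute is absent;
        # fail early instead of passing the text 'None' to json.loads.
        raise ValueError(f"no data-json attribute found for data-type={data_type!r}")
    return json.loads(raw)
```

With a real driver that has already loaded the page, this would be called as `data = extract_data_json(driver, "Biological")`. If several nested divs carry data-json, `find_element` returns only the first match, so the selector may need to be narrowed.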