I am new to scraping with Selenium and am stumped on how to extract a JSON string that is conveniently available directly within a div. The div even has an attribute named data-json.
The basic code I have so far is:
#this works perfectly
from selenium import webdriver
DRIVER_PATH = '/my/path/to/driver'
driver = webdriver.Chrome(executable_path=DRIVER_PATH)
driver.get('https://www.genecards.org/cgi-bin/carddisp.pl?gene=IL1R1')
#this errors out
element = driver.find_element_by_data_type('Biological')
#this also errors out
driver.execute_script('return JSON.stringify(window.dataJSON)')
The JSON that I am looking for is in a div within a div, where the outermost div has the attribute data-type='Biological' and the innermost div contains the JSON data itself.
Note that this webpage contains a few other embedded JSONs, so it is important to specify the data-type of the parent div. Any guidance on how to extract this JSON as a string variable would be appreciated!
Answer
I think you could try this:
import json  # needed for json.loads below
from selenium import webdriver
DRIVER_PATH = '/my/path/to/driver'
driver = webdriver.Chrome(executable_path=DRIVER_PATH)
driver.get('https://www.genecards.org/cgi-bin/carddisp.pl?gene=IL1R1')
element = driver.find_element_by_css_selector('#enhancerControllerComponent > div').get_attribute("data-json")
# goTermController
# element = driver.find_element_by_css_selector('#go_func > div.gc-subsection-inner-wrap > div > div').get_attribute("data-json")
j = json.loads(str(element))
print(j)
print(type(j))
If you want to fetch a different element, follow these steps:
1. Find the element in the browser's Developer Tools.
2. Right-click the element.
3. Choose "Copy" -> "Copy selector".
4. Replace the selector passed to find_element_by_css_selector with the copied one, as I did in the code above.
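Since the question asks to pick the JSON by the parent's data-type, an attribute selector may be more robust than a copied selector. The following is only a minimal sketch under some assumptions: that the JSON sits in a data-json attribute on a div nested somewhere inside the div with data-type='Biological', and that the driver offers the generic find_element(by, value) call (the find_element_by_* helpers are no longer available in newer Selenium 4 releases). The names build_selector and extract_data_json are hypothetical helpers made up for illustration, not part of Selenium.

```python
import json

# Same value as selenium.webdriver.common.by.By.CSS_SELECTOR, written as a
# plain string so this sketch has no import-time dependency on Selenium.
CSS_SELECTOR = "css selector"


def build_selector(data_type):
    """Build a CSS selector for a div with a data-json attribute nested
    inside a div whose data-type matches (assumed page structure)."""
    return f"div[data-type='{data_type}'] div[data-json]"


def extract_data_json(driver, data_type):
    """Return the parsed JSON stored in the first matching data-json attribute."""
    element = driver.find_element(CSS_SELECTOR, build_selector(data_type))
    raw = element.get_attribute("data-json")
    if raw is None:
        raise ValueError(f"no data-json attribute found for data-type {data_type!r}")
    return json.loads(raw)
```

With a live driver that has already loaded the page, extract_data_json(driver, 'Biological') would return the parsed object. If you need the string itself, use the value returned by get_attribute directly, or call json.dumps on the parsed result.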