Get embedded JSON data in div using selenium

I am new to scraping with Selenium and am stumped on how to extract JSON that is conveniently available directly within a div. The div even has an attribute named data-json.

The basic code I have so far is:

#this works perfectly

from selenium import webdriver
DRIVER_PATH = '/my/path/to/driver'

driver = webdriver.Chrome(executable_path=DRIVER_PATH) 
driver.get('https://www.genecards.org/cgi-bin/carddisp.pl?gene=IL1R1')


#this errors out
element = driver.find_element_by_data_type('Biological')

#this also errors out
driver.execute_script('return JSON.stringify(window.dataJSON)')

The JSON that I am looking for is in a div within a div, where the outermost div has the attribute data-type='Biological' and the innermost div holds the JSON data itself in its data-json attribute.

Note that this webpage contains a few other embedded JSONs so the data-type in the parent div is important to specify. Any guidance on how to extract this JSON as a string variable would be appreciated!

Answer

You could try the following, which reads the data-json attribute of the div and parses it with json.loads:

import json

from selenium import webdriver

DRIVER_PATH = '/my/path/to/driver'

driver = webdriver.Chrome(executable_path=DRIVER_PATH)
driver.get('https://www.genecards.org/cgi-bin/carddisp.pl?gene=IL1R1')

# get_attribute returns the raw JSON text stored in the div's data-json attribute
element = driver.find_element_by_css_selector('#enhancerControllerComponent > div').get_attribute("data-json")
# Alternative selector for the goTermController data:
# element = driver.find_element_by_css_selector('#go_func > div.gc-subsection-inner-wrap > div > div').get_attribute("data-json")

# Parse the JSON string into Python objects
j = json.loads(element)

print(j)
print(type(j))

If you want to fetch a different element, follow these steps:
1. Find the element in the browser's Developer Tools.
2. Right-click the element.
3. Choose "Copy" -> "Copy selector".
4. Replace the selector passed to find_element_by_css_selector with the copied one, as in the code above.
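
Since the question says the data-type of the parent div matters, a CSS attribute selector may be less brittle than a copied selector. The following is only a minimal sketch, assuming the page is structured as the question describes (an outer div with data-type and an inner div with data-json); I have not checked it against the live page. The helper names build_selector and get_embedded_json are made up for illustration.

```python
import json


def build_selector(data_type):
    # Match a div that has a data-json attribute and is nested anywhere
    # inside the div whose data-type attribute equals data_type.
    return 'div[data-type="{}"] div[data-json]'.format(data_type)


def get_embedded_json(driver, data_type):
    # "css selector" is the string value of Selenium's By.CSS_SELECTOR,
    # so this call does not need an extra import.
    element = driver.find_element("css selector", build_selector(data_type))
    raw = element.get_attribute("data-json")
    # Convert the JSON text into Python objects (dict, list, ...).
    return json.loads(raw)
```

After driver.get(...), this would be called as get_embedded_json(driver, 'Biological'). If you need the JSON as a string variable rather than parsed objects, use the value returned by get_attribute directly.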

User contributions licensed under: CC BY-SA