Why does get_attribute() in Selenium return an empty string when inspecting the webpage shows the attribute?

I am trying to grab the src attribute of the video tag on this webpage. The video tag is visible when I inspect the page. The XPath for the tag in Safari is //*[@id="player"]/div[2]/div[4]/video

This is my code:

import os

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
os.environ["SELENIUM_SERVER_JAR"] = "selenium-server-standalone-2.41.0.jar"
browser = webdriver.Safari()
browser.get("https://mplayer.me/default.php?id=MTc3ODc3")
print(WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.TAG_NAME,"video"))).get_attribute("src"))
browser.quit()

Using .text instead of .get_attribute() also returns an empty string. I have to use Safari rather than Chrome to get the src link because Chrome exposes the video through a blob URL, so scraping via Chrome shows "blob:https://mplayer.me/d420cb30-ed6e-4772-b169-ed33a5d3ee9f" instead of "https://wwwx18.gogocdn.stream/videos/hls/6CjH7KUeu18L4Y7ls0ohCw/1668685924/177877/81aa0af3891f4ef11da3f67f0d43ade6/ep.1.1657688313.m3u8", which is the link I want to get.

Answer

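You can get the link to the m3u8 file in Chrome from the browser's performance log, which is enabled through the goog:loggingPrefs capability. The log records the network requests the page makes, including the request for the playlist.

Here is one possible way to do this: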

import json
from selenium import webdriver
from selenium.webdriver.chrome.service import Service


options = webdriver.ChromeOptions()
options.add_argument('--headless')
# Enable Chrome's performance log so that network requests are recorded.
# The capability is set on the options object because newer Selenium 4
# releases no longer accept the desired_capabilities argument.
options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
options.add_experimental_option("excludeSwitches", ["enable-automation", "enable-logging"])
service = Service(executable_path="path/to/your/chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)

driver.get('https://mplayer.me/default.php?id=MTc3ODc3')
logs = driver.get_log('performance')

for log in logs:
    data = json.loads(log['message'])['message']['params'].get('request')
    if data and data['url'].endswith('.m3u8'):
        print(data['url'])

driver.quit()

Output:

https://wwwx18.gogocdn.stream/videos/hls/myv1spZ0483oSfvbo4bcbQ/1668706324/177877/81aa0af3891f4ef11da3f67f0d43ade6/ep.1.1657688313.m3u8

Tested on Win 10, Python 3.9.10, Selenium 4.5.0
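
If the playlist request fires some time after the page has loaded, a single call to get_log('performance') may come back without it. The following is a minimal sketch of how the filtering step could be pulled into a helper and polled. The function names extract_m3u8_urls and wait_for_m3u8, the timeout values, and the choice to look only at Network.requestWillBeSent events are my own assumptions, not part of the answer above. It also compares only the URL path, so a playlist URL followed by a query string would still match.

```python
import json
import time
from urllib.parse import urlparse


def extract_m3u8_urls(log_entries):
    """Return URLs of .m3u8 requests found in Chrome performance log entries.

    Hypothetical helper: log_entries is expected to look like the list
    returned by driver.get_log('performance').
    """
    urls = []
    for entry in log_entries:
        try:
            message = json.loads(entry["message"])["message"]
        except (KeyError, TypeError, ValueError):
            continue  # skip entries that are not well-formed DevTools messages
        if not isinstance(message, dict):
            continue
        # Assumption: only outgoing requests are of interest here.
        if message.get("method") != "Network.requestWillBeSent":
            continue
        url = message.get("params", {}).get("request", {}).get("url", "")
        # Check the path only, so a query string after ".m3u8" does not hide a match.
        if urlparse(url).path.endswith(".m3u8") and url not in urls:
            urls.append(url)
    return urls


def wait_for_m3u8(driver, timeout=20.0, poll_interval=0.5):
    """Poll the performance log until a .m3u8 URL shows up or the timeout elapses.

    Returns a list of URLs, or an empty list if none appeared in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        urls = extract_m3u8_urls(driver.get_log("performance"))
        if urls:
            return urls
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll_interval)
```

With a driver configured as in the answer, this could be used as urls = wait_for_m3u8(driver) right after driver.get(...), in place of the for loop.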

User contributions licensed under: CC BY-SA