Skip to content
Advertisement

How can I make Selenium get a href with PYTHON?

I tried all the methods I found here https://selenium-python.readthedocs.io/locating-elements.html but I couldn’t get that link from the page.

I need to get the link from the href

HTML page code:

<p style="margin: 0px 30px 30px 30px;text-align: center;-webkit-box-sizing: border-box;max-width: 100%%;-moz-box-sizing: border-box;box-sizing: border-box;">
                                <!--[if mso]>
                                    <v:roundrect xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w="urn:schemas-microsoft-com:office:word" href="https://u.pcloud.com/track?url=aHR0cHM6Ly91LnBjbG91ZC5jb20vPyNwYWdlPXZlcmlmeW1haWwmY29kZT1UTE9TN1pUWXN6MmxPSWh2bWhZUU9jd2RxSHk3bUJoejk3&token=j7yZTLOS7Z7Z87Zh354dzp5jNRuIf7aJshX1XzSehQX" style="height:49px;v-text-anchor:middle;width:289px;" arcsize="7%%" strokecolor="#88CC17" fillcolor="#88CC17">
                                        <w:anchorlock/>
                                        <center style="color:#ffffff;font-family:sans-serif;font-size:13px;font-weight:bold;">CLICK TO VERIFY EMAIL</center>
                                    </v:roundrect>
                                <![endif]-->
                                <a href="https://u.pcloud.com/track?url=aHR0cHM6Ly91LnBjbG91ZC5jb20vPyNwYWdlPXZlcmlmeW1haWwmY29kZT1UTE9TN1pUWXN6MmxPSWh2bWhZUU9jd2RxSHk3bUJoejk3&token=j7yZTLOS7Z7Z87Zh354dzp5jNRuIf7aJshX1XzSehQX" style="color: #FFF;background-color: #88CC17;text-decoration: none;width: 285px;font-weight: 500;display: inline-block;padding: 13px 0px 13px 0px;border: 2px solid #88CC17;border-radius: 3px;-moz-border-radius: 3px;-webkit-border-radius: 3px;mso-hide:all;">
                                    CLICK TO VERIFY EMAIL
                                </a>

Some of the unsuccessful attempts:

link_ativador = navegador.find_element(By.LINK_TEXT, 'CLICK TO VERIFY EMAIL')
print(link_ativador)

link_ativador = navegador.find_element(By.XPATH, '/html/body/table/tbody/tr[2]/td/table/tbody/tr[3]/td/table/tbody/tr/td/p[3]/a').get_attribute('href')
print(link_ativador)

link_ativador = navegador.find_element(By.LINK_TEXT, 'CLICK TO VERIFY EMAIL').get_attribute('href')
print(link_ativador)

+Some of the unsuccessful attempts:

link_try1 = WebDriverWait(navegador, 20).until(EC.presence_of_element_located((By.XPATH, '//a[text()="CLICK TO VERIFY EMAIL"]'))).get_attribute('href')
print(link_try1)

link_try2 = (WebDriverWait(navegador, 20).until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, "CLICK TO VERIFY EMAIL"))).get_attribute("href"))
print(link_try2)

link_try3 = (WebDriverWait(navegador, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(., 'CLICK TO VERIFY EMAIL')]"))).get_attribute("value"))
print(link_try3)

The full code:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# Chrome e Proxy Tor
servico = Service(ChromeDriverManager().install())
proxy = "socks5://127.0.0.1:9150"  # Tor
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument(f"--proxy-server={proxy}")
navegador = webdriver.Chrome(service=servico, options=chrome_options)

navegador.get('https://tmail.link/inbox/xxxxx.xxxxxx@tmail.link/')

Advertisement

Answer

You should be able to get that href with the following locator:

link_url = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, '//a[text()="CLICK TO VERIFY EMAIL"]'))).get_attribute('href')

You will also need to import:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Selenium docs: https://www.selenium.dev/documentation/

EDIT: Upon OP’s confirmation of the actual url, we can see that link is in an iframe, so the following code will retrieve it:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,720")

webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
actions = ActionChains(browser)

url = 'https://tmail.link/inbox/ambers.rushing@tmail.link/b2b20ce5eeb40a74a8c4ce7d4438883fa44a69c1/'
browser.get(url) 

WebDriverWait(browser, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[@src='1']")))
specific_url = WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, "CLICK TO VERIFY EMAIL"))).get_attribute("href")
print(specific_url)

This will print in terminal:

switched to iframe
https://u.pcloud.com/track?url=aHR0cHM6Ly91LnBjbG91ZC5jb20vPyNwYWdlPXZlcmlmeW1haWwmY29kZT1UTE9TN1pUWXN6MmxPSWh2bWhZUU9jd2RxSHk3bUJoejk3&token=j7yZTLOS7Z7Z87Zh354dzp5jNRuIf7aJshX1XzSehQX

The setup in the code above is chrome/chromedriver on linux, however you can adapt it to your own, just observe the imports, and the code after defining the browser/driver.

Advertisement