How to get a URL from HTML code with re.findall in Python

This is my HTML:

<html>
<body>
<img src="https://example.com/staks.jpg">
<a href="https://example.com">Link1</a>
<a href="https://example.com/page2">Link2</a>
</body>
</html>

I want to get the URL of Link1 as a Python variable:
import requests
import re

r = requests.get("https://myUrlExample.com")
s = r.text
pattern = re.compile("""href="https://""") # I don't know what pattern to use here
matches = re.findall(pattern, s)
if len(matches) > 0:
  print(matches[0])

Answer

Actually, I found this using GitHub Copilot:

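s = r.text
# Put the group around the whole URL so the match keeps its "https://" prefix
pattern = r'href="(https://.*?)"'
urls = re.findall(pattern, s)
url = urls[0] if urls else None  # first match is Link1's URL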
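
To try the regex approach without a network request, here is a minimal sketch that runs against the HTML from the question. The names find_links, HREF_PATTERN and link1 are made up for illustration, and the pattern assumes double-quoted href attributes holding absolute http(s) URLs, which is true of the sample but not of every page.

```python
import re

# Sample markup copied from the question; in the real script this would be r.text.
html = """<html>
<body>
<img src="https://example.com/staks.jpg">
<a href="https://example.com">Link1</a>
<a href="https://example.com/page2">Link2</a>
</body>
</html>"""

# The group captures the full URL (scheme included) of every href attribute.
# [^"]+ stops at the closing quote, so each match stays inside one attribute.
HREF_PATTERN = re.compile(r'href="(https?://[^"]+)"')


def find_links(markup):
    """Return all http(s) URLs found in double-quoted href attributes."""
    return HREF_PATTERN.findall(markup)


links = find_links(html)
link1 = links[0] if links else None  # first href in document order
print(link1)
```

Because the pattern contains exactly one group, re.findall returns the captured URLs rather than the whole match, and the img tag is skipped because it has src instead of href. For pages with single-quoted or relative links, an HTML parser such as the standard library's html.parser is likely a safer choice than a regex.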
User contributions licensed under: CC BY-SA