I'm trying to extract the 'src' from the iframe on this page, but I'm not succeeding. The iframe is dynamic: it only appears after I run a search.
Site: http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001
view-source:http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001
import requests
from bs4 import BeautifulSoup

r = requests.get("http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001")
arquivo = BeautifulSoup(r.content, "html.parser")
for link in arquivo.find_all("iframe"):
    print(link)
Answer
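The iframe only appears in the response to the search form's POST request, so a plain GET will not contain it. To simulate that POST, first load the page, copy the values of its existing input fields (ASP.NET pages typically carry hidden state fields there), then submit them together with the search fields: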
import requests
from bs4 import BeautifulSoup
url = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
data = {}
for inp in soup.select("input[value]"):
    data[inp["name"]] = inp["value"]
data["ctl00$MainContent$txtNumero"] = "3001" # <-- this is your number
data["ctl00$MainContent$ddlEspecie"] = ""
data["ctl00$MainContent$ddlAno"] = ""
data["ctl00$MainContent$txtConteudo"] = ""
data["ctl00$MainContent$txtEmenta"] = ""
data["ctl00$MainContent$imgBuscar.x"] = "1"
data["ctl00$MainContent$imgBuscar.y"] = "9"
soup = BeautifulSoup(requests.post(url, data=data).content, "html.parser")
print(soup.iframe["src"])
Prints:
../procuradoriacg/Leis1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
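The printed 'src' is relative to the page URL. If you need an absolute link to the PDF, one option is to resolve it with `urljoin` from the standard library. This is a minimal sketch; the names `page_url`, `relative_src` and `pdf_url` are just illustrative, and it assumes the server serves the file at the resolved path:

```python
from urllib.parse import urljoin

# URL of the page the iframe was found on (same as `url` in the example above)
page_url = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"

# Relative 'src' value as printed by the example above
relative_src = "../procuradoriacg/Leis1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf"

# Resolve the relative path against the page URL
pdf_url = urljoin(page_url, relative_src)
print(pdf_url)
```

In the scraping code you would pass `soup.iframe["src"]` instead of the hard-coded string.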
EDIT: To look up a range of law numbers:
import requests
from bs4 import BeautifulSoup
url = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
data = {}
for inp in soup.select("input[value]"):
    data[inp["name"]] = inp["value"]
data["ctl00$MainContent$ddlEspecie"] = ""
data["ctl00$MainContent$ddlAno"] = ""
data["ctl00$MainContent$txtConteudo"] = ""
data["ctl00$MainContent$txtEmenta"] = ""
data["ctl00$MainContent$imgBuscar.x"] = "1"
data["ctl00$MainContent$imgBuscar.y"] = "9"
for i in range(3000, 3010):
    data["ctl00$MainContent$txtNumero"] = i
    s = BeautifulSoup(requests.post(url, data=data).content, "html.parser")
    if s.find("iframe"):
        print(i, s.iframe["src"])
    else:
        print(i, "Not Found")
Prints:
3000 Not Found
3001 ../procuradoriacg/Leis1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
3002 Not Found
3003 ../procuradoriacg/Leis1994/8279_LEI30031994pag0001_strDocumentoOficial.pdf
3004 Not Found
3005 Not Found
3006 ../procuradoriacg/Leis1994/8282_LEI30061994pag0001_strDocumentoOficial.pdf
3007 Not Found
3008 Not Found
3009 Not Found