I’m trying to extract the ‘src’ from this iframe, but I’m not succeeding. The iframe is dynamic: it only appears after I run a search.
Site: http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001
view-source:http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001
    import requests
    from bs4 import BeautifulSoup

    r = requests.get("http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001")
    arquivo = BeautifulSoup(r.content, "html.parser")
    for link in arquivo.find_all("iframe"):
        print(link)
Answer
To simulate the POST request this site sends when you run a search, you can use this example:
    import requests
    from bs4 import BeautifulSoup

    url = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    data = {}
    for inp in soup.select("input[value]"):
        data[inp["name"]] = inp["value"]

    data["ctl00$MainContent$txtNumero"] = "3001"  # <-- this is your number
    data["ctl00$MainContent$ddlEspecie"] = ""
    data["ctl00$MainContent$ddlAno"] = ""
    data["ctl00$MainContent$txtConteudo"] = ""
    data["ctl00$MainContent$txtEmenta"] = ""
    data["ctl00$MainContent$imgBuscar.x"] = "1"
    data["ctl00$MainContent$imgBuscar.y"] = "9"

    soup = BeautifulSoup(requests.post(url, data=data).content, "html.parser")
    print(soup.iframe["src"])
Prints:
../procuradoriacg/Leis1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
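The initial GET matters because pages like this one usually carry hidden state inputs that the server expects back in the POST. The `.aspx` extension suggests ASP.NET Web Forms, where such fields are typically named `__VIEWSTATE` and `__EVENTVALIDATION`, but that is an assumption about this site. Below is a minimal offline sketch of what the `input[value]` loop collects. It uses a made-up HTML fragment with hypothetical field values, and the standard library's `HTMLParser` instead of BeautifulSoup so that it runs without network access:

```python
from html.parser import HTMLParser

# Hypothetical fragment standing in for the real page's form.
SAMPLE_HTML = """
<form method="post" action="./Default.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="ctl00$MainContent$txtNumero" />
  <input type="image" name="ctl00$MainContent$imgBuscar" src="buscar.png" />
</form>
"""


class InputCollector(HTMLParser):
    """Collects name/value pairs from <input> tags that carry a value attribute."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)
        # Like the CSS selector input[value]: skip inputs without a value
        # (and, in this sketch, also inputs without a name).
        if "name" in attr_map and "value" in attr_map:
            self.fields[attr_map["name"]] = attr_map["value"]


collector = InputCollector()
collector.feed(SAMPLE_HTML)
fields = collector.fields

# Clicking an <input type="image"> submits the click coordinates
# as "<name>.x" and "<name>.y", which is why the answer sets them by hand.
fields["ctl00$MainContent$imgBuscar.x"] = "1"
fields["ctl00$MainContent$imgBuscar.y"] = "9"
fields["ctl00$MainContent$txtNumero"] = "3001"

for name, value in fields.items():
    print(name, "=", value)
```

Only the two hidden inputs are picked up automatically here; the text box and the image button have no `value` attribute, so their entries have to be added explicitly, as the answer does.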
EDIT: To get multiple pages:
    import requests
    from bs4 import BeautifulSoup

    url = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    data = {}
    for inp in soup.select("input[value]"):
        data[inp["name"]] = inp["value"]

    data["ctl00$MainContent$ddlEspecie"] = ""
    data["ctl00$MainContent$ddlAno"] = ""
    data["ctl00$MainContent$txtConteudo"] = ""
    data["ctl00$MainContent$txtEmenta"] = ""
    data["ctl00$MainContent$imgBuscar.x"] = "1"
    data["ctl00$MainContent$imgBuscar.y"] = "9"

    for i in range(3000, 3010):
        data["ctl00$MainContent$txtNumero"] = i
        s = BeautifulSoup(requests.post(url, data=data).content, "html.parser")
        if s.find("iframe"):
            print(i, s.iframe["src"])
        else:
            print(i, "Not Found")
Prints:
    3000 Not Found
    3001 ../procuradoriacg/Leis1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
    3002 Not Found
    3003 ../procuradoriacg/Leis1994/8279_LEI30031994pag0001_strDocumentoOficial.pdf
    3004 Not Found
    3005 Not Found
    3006 ../procuradoriacg/Leis1994/8282_LEI30061994pag0001_strDocumentoOficial.pdf
    3007 Not Found
    3008 Not Found
    3009 Not Found
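The `src` values printed above are relative to the page URL. If you need a full link to the PDF, one option is to resolve them with `urllib.parse.urljoin`. This is a small sketch, and `absolute_src` is a hypothetical helper name, not something from the site or from the code above:

```python
from urllib.parse import urljoin

# Page URL from the answer; each iframe src is relative to it.
PAGE_URL = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"


def absolute_src(src, page_url=PAGE_URL):
    """Resolve a relative iframe src against the page URL (hypothetical helper)."""
    return urljoin(page_url, src)


pdf_url = absolute_src(
    "../procuradoriacg/Leis1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf"
)
print(pdf_url)
# http://191.253.16.180:8080/procuradoriacg/Leis1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
```

Inside the loop you could call `absolute_src(s.iframe["src"])` instead of printing the raw attribute.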