I want to remove a specific tr block from my scraped output. When I run my code the output is almost perfect, but it also scrapes the header row <tr><td>Domain</td><td>Last Resolved Date</td></tr>
I don't want this line in my output. How can I remove it? Code below.
Update: I found a fix.
Old code
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = "https://viewdns.info/reverseip/?host=github.com&t=1"

text = requests.get(url, headers=headers).text
soup = BeautifulSoup(text, 'html.parser')
table = soup.find('table', attrs={'border':'1'})
domain = table.findAll('td', attrs={'align':None})
for line in domain:
    print(line.text)
Fixed code
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = "https://viewdns.info/reverseip/?host=github.com&t=1"

text = requests.get(url, headers=headers).text
soup = BeautifulSoup(text, 'html.parser')
table = soup.find('table', attrs={'border':'1'})
domain = table.findAll('td', attrs={'align':None})[2:]
for line in domain:
    print(line.text)
Answer
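Try the code below. The first two cells that findAll returns are the header cells (Domain and Last Resolved Date), so slicing the result with [2:] skips them.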
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = "https://viewdns.info/reverseip/?host=github.com&t=1"

text = requests.get(url, headers=headers).text
soup = BeautifulSoup(text, 'html.parser')
table = soup.find('table', attrs={'border':'1'})
domain = table.findAll('td', attrs={'align':None})[2:]
for line in domain:
    print(line.text)
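
If slicing cells by position feels fragile, another option is to skip the header by row instead of by cell. The sketch below is only an illustration: SAMPLE_HTML is made-up markup standing in for the real page (the actual viewdns.info HTML may differ), and extract_domains is a hypothetical helper name, not part of the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page layout is an assumption here.
SAMPLE_HTML = """
<table border="1">
  <tr><td>Domain</td><td>Last Resolved Date</td></tr>
  <tr><td>example.com</td><td align="center">2020-01-01</td></tr>
  <tr><td>example.org</td><td align="center">2020-01-02</td></tr>
</table>
"""


def extract_domains(html):
    """Return the first cell of every row except the header row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"border": "1"})
    if table is None:
        # No matching table, e.g. the request was blocked or the layout changed.
        return []
    domains = []
    # [1:] drops the first <tr>, which is assumed to be the header row.
    for row in table.find_all("tr")[1:]:
        cells = row.find_all("td")
        if cells:
            domains.append(cells[0].get_text(strip=True))
    return domains


for name in extract_domains(SAMPLE_HTML):
    print(name)
```

To use it on the live page, pass in the text from requests.get(url, headers=headers).text instead of SAMPLE_HTML. Skipping by row does not depend on which cells happen to carry an align attribute, and the None check avoids an AttributeError when the table is missing.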