
How to select all <b> tags in an HTML page

From this webpage I need to select all <b> tags with BeautifulSoup4.

import requests
from bs4 import BeautifulSoup

url = "http://lib.ru/GrepSearch?Search=%E3%E5%F0%EE%E9+%ED%E0%F8%E5%E3%EE+%E2%F0%E5%EC%E5%ED%E8"
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
author = soup.select('b')
print(author)

I have tried both find_all() and select(), but the list they return does not contain all of the <b> tags.


Answer

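BeautifulSoup can use several different parsers to build its tree from an HTML document. 'html.parser' is the one built into Python's standard library; lxml and html5lib are third-party alternatives. The parsers can build different trees from the same invalid markup, so switching parser may change which tags are found. I have used lxml here, which has to be installed separately (for example with pip install lxml). This code should give you the raw output you asked for (author and book name). You still have to process it to get your desired output.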
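
As a minimal sketch of how the parser choice can be checked, the snippet below counts the <b> tags that each parser finds in the same markup. The sample string and the helper name count_b_tags are assumptions made for illustration, not part of the original page; a parser that is not installed is reported as None rather than raising.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample markup; the real lib.ru page is not reproduced here.
SAMPLE_HTML = "<html><body><p><b>Author One</b> <b>Book One</b></p></body></html>"

def count_b_tags(html, parser):
    """Return how many <b> tags the given parser finds, or None if it is not installed."""
    try:
        soup = BeautifulSoup(html, parser)
    except FeatureNotFound:
        # Raised by BeautifulSoup when the requested parser is unavailable.
        return None
    return len(soup.find_all("b"))

for parser in ("html.parser", "lxml", "html5lib"):
    print(parser, count_b_tags(SAMPLE_HTML, parser))
```

Running the same loop on the downloaded page instead of the sample string would show whether the parsers disagree about that particular document.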

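import requests
from bs4 import BeautifulSoup

# Use a name other than `requests` so the module itself is not shadowed.
response = requests.get('http://lib.ru/GrepSearch?Search=history')

# Pass the raw bytes so BeautifulSoup can detect the page encoding itself.
src = response.content

# 'lxml' selects the HTML parser from the third-party lxml package.
soup = BeautifulSoup(src, 'lxml')

b_tags = soup.find_all('b')

for b in b_tags:
    print(b.text)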
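
The remaining processing step could look roughly like the sketch below, which strips whitespace from each <b> tag's text and drops empty entries. The helper name extract_b_texts and the sample markup are hypothetical, and the built-in 'html.parser' is used as the default only so the sketch runs without extra packages. How authors and book names are arranged on the real page is not known here, so no pairing is attempted.

```python
from bs4 import BeautifulSoup

def extract_b_texts(html, parser="html.parser"):
    """Return the stripped, non-empty text of every <b> tag in the markup."""
    soup = BeautifulSoup(html, parser)
    texts = (b.get_text(strip=True) for b in soup.find_all("b"))
    return [t for t in texts if t]

# Hypothetical sample markup standing in for the real search results page.
sample = "<p><b> Author One </b> <b></b> <b>Book One</b></p>"
print(extract_b_texts(sample))
```

With the answer's code, the same function could be called as extract_b_texts(response.content, 'lxml').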


User contributions licensed under: CC BY-SA