I'm trying web scraping with bs4 and I don't understand what the following line does. Could someone please explain it to me? Thanks.
    name = div.contents[0].string + div.contents[1]
Answer
The .contents attribute holds a list of an element's direct children, which can be tags or text nodes. The .string attribute holds the text content of an element when that element has exactly one child; if it has several children, .string is None.
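To make the two attributes concrete, here is a minimal sketch that runs on a small inline snippet instead of a live page. The HTML string is made up for illustration and is not taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented only for this illustration.
html = "<p>Hello <b>world</b>!</p>"
soup = BeautifulSoup(html, "html.parser")
p = soup.p

# .contents is a plain Python list of the direct children:
# a text node, a <b> tag, and another text node.
children = p.contents
print(len(children))   # 3
print(children)

# The <b> tag has exactly one text child, so .string returns that text.
bold = children[1]
print(bold.string)     # world

# The <p> tag has three children, so .string is None.
print(p.string)        # None
```

Because .contents is an ordinary list, indexing such as contents[0] or contents[1] simply picks the first or second child, whatever kind of node it happens to be.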
Using this page as an example:
    import requests
    from bs4 import BeautifulSoup
    from pprint import pprint

    resp = requests.get("https://stackoverflow.com/questions/73842279/what-is-contents-in-beautifulsoup4-and-the-number-string")
    soup = BeautifulSoup(resp.text, 'html.parser')

    # Look for the <div> whose id is "question-header"
    for elem in soup.find_all('div'):
        if elem.has_attr('id') and elem['id'].strip() == "question-header":
            # All direct children of the div (tags and text nodes)
            pprint(elem.contents)

            # Text of the second child, which is the <h1> tag
            pprint(elem.contents[1].string)
Output for elem.contents:

    ['\n',
     <h1 class="fs-headline1 ow-break-word mb8 flex--item fl1" itemprop="name"><a class="question-hyperlink" href="/questions/73842279/what-is-contents-in-beautifulsoup4-and-the-number-string">What is contents in beautifulsoup4 and the number string?</a></h1>,
     '\n',
     <div class="ml12 aside-cta flex--item print:d-none sm:ml0 sm:mb12 sm:order-first sm:as-end">
     <a class="ws-nowrap s-btn s-btn__primary" href="/questions/ask">
     Ask Question
     </a>
     </div>,
     '\n']
Output for elem.contents[1].string:

    'What is contents in beautifulsoup4 and the number string?'
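Applied to the line from the question, the same idea can be sketched as follows. The markup here is an assumption, since the question does not show the page being scraped; it only illustrates a div whose first child is a tag and whose second child is plain text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page from the question is not shown.
html = "<div><b>John</b> Smith</div>"
div = BeautifulSoup(html, "html.parser").div

first = div.contents[0].string   # text inside the first child, the <b> tag
second = div.contents[1]         # the second child, which is already a text node

# Text nodes behave like Python strings, so + concatenates them.
name = first + second
print(name)                      # John Smith
```

If the second child were a tag instead of a text node, the + would raise a TypeError, so an expression like this only works when the page has exactly the structure it expects.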