I have a pretty specific question if anyone could help that would be awesome.
I am trying to scrape the songs from https://www.billboard.com/charts/hot-100/ and I am stuck on this code.
What should I put for tag = soup.find_all('???')
to get the title ‘About Damn Time.
JavaScript
x
13
13
1
import requests
2
import bs4
3
4
url = 'https://www.billboard.com/charts/hot-100/'
5
6
song_title = []
7
8
9
res = requests.get(url)
10
soup = bs4.BeautifulSoup(res.text, 'lxml')
11
tag = soup.find_all('???')
12
print(tag)
13
Advertisement
Answer
Try to select the row containers, iterate the ResultSet
get your expected text from the <h3>
:
JavaScript
1
3
1
for e in soup.find_all(attrs={'class':'o-chart-results-list-row-container'}):
2
print(e.h3.get_text(strip=True))
3
Example
If you try to store different information, avoid a bunch of differnet lists. Instead use one list with dict of each item and its information.
JavaScript
1
17
17
1
import requests
2
import bs4
3
4
url = 'https://www.billboard.com/charts/hot-100/'
5
res = requests.get(url)
6
soup = bs4.BeautifulSoup(res.text)
7
8
data = []
9
10
for e in soup.find_all(attrs={'class':'o-chart-results-list-row-container'}):
11
data.append({
12
'title':e.h3.get_text(strip=True),
13
'author':e.h3.find_next('span').get_text(strip=True)
14
})
15
16
data
17