I have been playing around with Beautiful Soup to learn it. So far I've picked up a few things, but I'm struggling to put my use case together. How do I print the text of both movielist and moviescore, joined together? I appreciate the help and info. I'm really enjoying Python and some of its applications, like web scraping.
```python
import requests
from bs4 import BeautifulSoup
result = requests.get("https://www.rottentomatoes.com/browse/opening")
print("Checking Website")
print(result.status_code)
print("Gathering Website data and preparing it for presentation")
src = result.content
soup = BeautifulSoup(src, 'lxml')
movielist = soup.find_all("div",attrs={"class":"media-list__title"})
moviescore = soup.find_all("span",attrs={"class":"tMeterScore"})
for movielist in soup.find_all("div",attrs={"class":"media-list__title"}):
    print (movielist.text)
```
Answer
The key here is to “zip” the two lists you have. Before you do that, though, you need to get the text of each element and strip the surrounding whitespace.
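As a rough illustration of those two steps, here is a minimal sketch that runs on a small inline HTML string instead of the live site. The markup is made up for the example; only the class names are taken from the code in the question, and `html.parser` is used so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; only the class names
# come from the code in the question.
html = """
<div class="media-list__title">  The Courier </div>
<span class="tMeterScore"> 79% </span>
<div class="media-list__title">
    Happily
</div>
<span class="tMeterScore">70%</span>
"""

soup = BeautifulSoup(html, "html.parser")

# Step 1: pull the text out of each tag and strip surrounding whitespace.
titles = [
    tag.get_text(strip=True)
    for tag in soup.find_all("div", class_="media-list__title")
]
scores = [
    tag.get_text(strip=True)
    for tag in soup.find_all("span", class_="tMeterScore")
]

# Step 2: zip pairs the first title with the first score, and so on.
pairs = list(zip(titles, scores))
for title, score in pairs:
    print(f"{title}: {score}")
```

Without `strip=True` the titles would keep the spaces and newlines that surround them in the HTML, which is why the text is cleaned before it is paired.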
Here’s a slight modification of your code:
```python
import requests
from bs4 import BeautifulSoup


result = requests.get("https://www.rottentomatoes.com/browse/opening")

print("Checking Website")
print(result.status_code)
print("Gathering Website data and preparing it for presentation")

soup = BeautifulSoup(result.content, 'lxml')

# get each movie title and remove any whitespace characters
movies = [
    title.getText(strip=True) for title in
    soup.find_all("div", attrs={"class": "media-list__title"})
]
# get each movie score, remove any whitespace chars, and replace '- -'
# with a custom message -> No score yet. :(
movie_scores = [
    score.getText(strip=True).replace("- -", "No score yet. :(") for score
    in soup.select(".media-list__meter-container")  # introducing css selectors :)
]

for movie_data in zip(movies, movie_scores):  # zipping the two lists
    title, score = movie_data  # this outputs a tuple: (MOVIE_TITLE, MOVIE_SCORE)
    print(f"{title}: {score}")
```
Output:
```
Checking Website
200
Gathering Website data and preparing it for presentation
The Courier: 79%
The Heiress: No score yet. :(
The Stay: No score yet. :(
City of Lies: 50%
Happily: 70%
Doors: No score yet. :(
Last Call: No score yet. :(
Enforcement: 100%
Phobias: No score yet. :(
Dark State: No score yet. :(
Food Club: 83%
Wojnarowicz: 100%
```
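One thing to keep in mind with `zip`: it stops at the shorter input, so if the page ever yields fewer scores than titles, the extra titles are dropped without any error. The sketch below uses hard-coded sample lists (not scraped data) and the standard library's `itertools.zip_longest` to make such a mismatch visible. The `"(missing)"` fill value is just a placeholder chosen for this example.

```python
from itertools import zip_longest

# Sample data for illustration only; one score is left out on purpose.
movies = ["The Courier", "The Heiress", "City of Lies"]
movie_scores = ["79%", "No score yet. :("]

# zip silently stops at the shorter list, so the last title disappears.
short = list(zip(movies, movie_scores))

# zip_longest keeps every title and pads the missing score instead.
padded = list(zip_longest(movies, movie_scores, fillvalue="(missing)"))

for title, score in padded:
    print(f"{title}: {score}")
```

Note that padding only reveals that the lengths differ; it pads at the end and cannot tell which movie actually lacks a score. If the lists can get out of step, a more robust approach is probably to loop over each movie's container element and read its title and score from there.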