I have been playing around with Beautiful Soup to learn it. So far I've learned a few things, but I'm struggling to put my use case together. How do I print the text of both movielist and moviescore, appended together? I appreciate the help and info. I'm really enjoying Python and some of its applications, like web scraping.
import requests
from bs4 import BeautifulSoup

result = requests.get("https://www.rottentomatoes.com/browse/opening")
print("Checking Website")
print(result.status_code)
print("Gathering Website data and preparing it for presentation")

src = result.content
soup = BeautifulSoup(src, 'lxml')

movielist = soup.find_all("div", attrs={"class": "media-list__title"})
moviescore = soup.find_all("span", attrs={"class": "tMeterScore"})

for movielist in soup.find_all("div", attrs={"class": "media-list__title"}):
    print(movielist.text)
Answer
The key here is to “zip” the two lists you have. But before this happens you need to get the text value from each element and strip it.
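To see the idea in isolation, here is a minimal sketch that uses plain strings instead of scraped elements. The sample data and the helper name `pair_up` are made up for illustration; only `str.strip`, `str.replace`, and the built-in `zip` are real.

```python
# Hypothetical sample data standing in for the text of the scraped elements.
raw_titles = ["  The Courier\n", "\nThe Heiress  "]
raw_scores = [" 79% ", " - - "]


def pair_up(titles, scores, placeholder="No score yet"):
    """Strip every string, then pair titles with scores by position."""
    clean_titles = [t.strip() for t in titles]
    # A bare "- -" is assumed to mean the movie has no score yet.
    clean_scores = [s.strip().replace("- -", placeholder) for s in scores]
    return list(zip(clean_titles, clean_scores))


for title, score in pair_up(raw_titles, raw_scores):
    print(f"{title}: {score}")
```

Each item produced by `zip` is a `(title, score)` tuple, so it can be unpacked directly in the `for` statement.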
Here’s a slight modification of your code:
import requests
from bs4 import BeautifulSoup

result = requests.get("https://www.rottentomatoes.com/browse/opening")
print("Checking Website")
print(result.status_code)
print("Gathering Website data and preparing it for presentation")

soup = BeautifulSoup(result.content, 'lxml')

# get each movie title and remove any whitespace characters
movies = [
    title.getText(strip=True) for title
    in soup.find_all("div", attrs={"class": "media-list__title"})
]

# get each movie score, remove any whitespace chars, and replace '- -'
# with a custom message -> No score yet. :(
movie_scores = [
    score.getText(strip=True).replace("- -", "No score yet. :(")
    for score in soup.select(".media-list__meter-container")  # introducing css selectors :)
]

for movie_data in zip(movies, movie_scores):  # zipping the two lists
    title, score = movie_data  # this outputs a tuple: (MOVIE_TITLE, MOVIE_SCORE)
    print(f"{title}: {score}")
Output:
Checking Website
200
Gathering Website data and preparing it for presentation
The Courier: 79%
The Heiress: No score yet. :(
The Stay: No score yet. :(
City of Lies: 50%
Happily: 70%
Doors: No score yet. :(
Last Call: No score yet. :(
Enforcement: 100%
Phobias: No score yet. :(
Dark State: No score yet. :(
Food Club: 83%
Wojnarowicz: 100%
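One thing to keep in mind: `zip` stops at the shorter of its inputs, so if the two selectors ever matched a different number of elements, the extra items would be dropped silently. A small sketch of how that could be made visible with `itertools.zip_longest` from the standard library; the lists and the `"(missing)"` marker are made-up examples, not data from the site.

```python
from itertools import zip_longest

# Hypothetical lists of unequal length, to show the difference.
titles = ["The Courier", "The Heiress", "The Stay"]
scores = ["79%", "No score yet"]  # one entry short, on purpose

# Plain zip stops at the shorter list, so the third title is dropped.
short_pairs = list(zip(titles, scores))

# zip_longest keeps every item and fills the gaps with a marker value.
pairs = list(zip_longest(titles, scores, fillvalue="(missing)"))

for title, score in pairs:
    print(f"{title}: {score}")
```

In the output above both lists have twelve entries, so plain `zip` is fine there; this is only a guard in case the page layout changes.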