Whenever I try to extract the data, it returns “None”. I am not sure whether the problem is my code (I followed the usual rules for using bs4) or whether this website is simply different to scrape.
My code:
import requests
import bs4 as bs

url = 'https://www.zomato.com/jakarta/pondok-indah-restaurants'
req = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
html = req.text
soup = bs.BeautifulSoup(html, "html.parser")
listings = soup.find('div', class_='sc-gAmQfK fKxEbD')
rest_name = listings.find('h4', class_='sc-1hp8d8a-0 sc-eTyWNx gKsZcT').text
##Output: AttributeError: 'NoneType' object has no attribute 'find'
print(listings)
##returns None
Here is the inspected tag from the website. I am trying to get the h4 element whose class shows the restaurant’s name:
Answer
What happens?
The class names are generated dynamically and may differ from what you see when inspecting the page in developer tools, so soup.find() does not find the div you are looking for and returns None.
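To see the failure in isolation, here is a minimal sketch that runs against a small inline HTML string instead of the live site. The markup and its class names (`sc-abc123`, `sc-def456`) are made up for illustration and only stand in for whatever the server actually returns.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: these class names are invented and deliberately
# differ from the ones copied out of developer tools.
SAMPLE_HTML = """
<div class="sc-abc123 xYzQw">
  <a href="/jakarta/example-restaurant/info">
    <h4 class="sc-def456">Example Restaurant</h4>
  </a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The class string from the question is not present in this HTML,
# so find() returns None instead of a Tag.
listings = soup.find("div", class_="sc-gAmQfK fKxEbD")
print(listings)

# Calling listings.find(...) here would raise AttributeError,
# so check for None before chaining another lookup.
rest_name = listings.find("h4").text if listings is not None else None
print(rest_name)
```

Both prints show None, which matches the behaviour described in the question.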
How to fix?
A better approach is to select your targets by tag or id, if available, because these are more stable than CSS classes.
listings = soup.select('a:has(h4)')
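As a rough illustration of what that selector matches, the sketch below runs `a:has(h4)` against an inline HTML string. The markup is hypothetical, and the `:has()` pseudo-class assumes a BeautifulSoup version that ships with the soupsieve selector engine (4.7 or later).

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one plain link and two links that wrap an <h4>.
SAMPLE_HTML = """
<a href="/about">About</a>
<a href="/jakarta/example-one/info"><h4 class="sc-aaa">Example One</h4></a>
<div>
  <a href="/jakarta/example-two/info"><div><h4 class="sc-bbb">Example Two</h4></div></a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# a:has(h4) keeps only <a> elements that contain an <h4> somewhere inside,
# regardless of which class names the elements carry.
matches = soup.select("a:has(h4)")
titles = [a.h4.get_text(strip=True) for a in matches]
print(titles)
```

The plain "About" link is skipped because it has no h4 descendant, and the class names never enter the selection.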
Example
Iterating over the listings and scraping several pieces of information:
import requests
import bs4 as bs

url = 'https://www.zomato.com/jakarta/pondok-indah-restaurants'
req = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
html = req.text
soup = bs.BeautifulSoup(html, "html.parser")

data = []

# Each restaurant card is a link that wraps an <h4> holding the name
for item in soup.select('a:has(h4)'):
    data.append({
        'title': item.h4.text,
        'url': item['href'],
        'etc': '...'
    })

print(data)
Output
[{'title': 'Radio Dalam Diner', 'url': '/jakarta/radio-dalam-diner-pondok-indah/info', 'etc': '...'},
 {'title': 'Aneka Bubur 786', 'url': '/jakarta/aneka-bubur-786-pondok-indah/info', 'etc': '...'},
 {'title': "McDonald's", 'url': '/jakarta/mcdonalds-pondok-indah/info', 'etc': '...'},
 {'title': 'KOPIKOBOY', 'url': '/jakarta/kopikoboy-pondok-indah/info', 'etc': '...'},
 {'title': 'Kopitelu', 'url': '/jakarta/kopitelu-pondok-indah/info', 'etc': '...'},
 {'title': 'KFC', 'url': '/jakarta/kfc-pondok-indah/info', 'etc': '...'},
 {'title': 'HokBen Delivery', 'url': '/jakarta/hokben-delivery-pondok-indah/info', 'etc': '...'},
 {'title': 'PHD', 'url': '/jakarta/phd-pondok-indah/info', 'etc': '...'},
 {'title': 'Casa De Jose', 'url': '/jakarta/casa-de-jose-pondok-indah/info', 'etc': '...'}]
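The scraped `url` values are relative paths. If you need full links, one possible follow-up is to join them with the page URL using `urljoin` from the standard library. This is only a sketch: the `data` list below is a two-row excerpt of the output above, and `BASE_URL` and `absolute` are names chosen for illustration.

```python
from urllib.parse import urljoin

# Page the listings were scraped from (same URL as in the example above).
BASE_URL = "https://www.zomato.com/jakarta/pondok-indah-restaurants"

# Two rows taken from the scraped output, kept short for illustration.
data = [
    {"title": "KFC", "url": "/jakarta/kfc-pondok-indah/info"},
    {"title": "PHD", "url": "/jakarta/phd-pondok-indah/info"},
]

# urljoin resolves a root-relative path against the scheme and host of BASE_URL.
absolute = [{**row, "url": urljoin(BASE_URL, row["url"])} for row in data]

for row in absolute:
    print(row["title"], row["url"])
```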