I’m new to web scraping and I’m trying to build a very basic stock tracker for the site pokemoncenter.com. When visiting the product pages of items on the live site, the add to cart button displays as:
<button type="button" class="jsx-2748458255 product-add btn btn-secondary">Add to Cart</button>
When the item is out of stock the button is:
<button type="button" disabled="" class="jsx-2748458255 product-add btn btn-tertiary disabled">Out of Stock</button>
But whenever I try to scrape the site, regardless of whether the item is in stock or not, the button is:
<button class="jsx-2748458255 product-add btn btn-tertiary disabled" disabled="" type="button"></button>
So the button always appears as out of stock when I download the HTML with requests.get().
import bs4
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen, Request
import requests

page_url = "https://www.pokemoncenter.com/product/701-00364/primal-groudon-poke-plush-17-3-4-in"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'}
req = requests.get(page_url, headers=headers)
page_soup = soup(req.text, "html.parser")

# Find add to cart button
divs = page_soup.findAll("div", {"class": "jsx-829839431 product-col"})
button = str(divs[1].find("button", {"class": "jsx-2748458255"}))

# Check if button is disabled or not
if button.find('disabled') != -1:
    print("Out of Stock")
else:
    print("In Stock")
In stock example: https://www.pokemoncenter.com/product/701-00364/primal-groudon-poke-plush-17-3-4-in
Out of stock example: https://www.pokemoncenter.com/product/701-06558/gigantamax-pikachu-poke-plush-17-in
Answer
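As goalie1998 mentioned, the site most likely uses JavaScript to fill in the button's final state after the initial HTML has loaded. requests.get() only downloads that initial HTML and never runs the scripts, which would explain why you always see the empty, disabled placeholder button. You can still scrape the site with Selenium, since it drives a real browser that runs the JavaScript before you read the page.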
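Here is a rough sketch of that approach, not a tested solution for this particular site. It assumes the button keeps the product-add class shown in the question, that Selenium 4 and Chrome are installed, and that the button text is filled in once the scripts have run. The names fetch_rendered_button_html and stock_status are made up for illustration. The parsing part uses only the standard library and reports the empty placeholder button as "unknown" instead of out of stock.

```python
from html.parser import HTMLParser


class ProductButtonParser(HTMLParser):
    """Collects the first <button> whose class list contains 'product-add'."""

    def __init__(self):
        super().__init__()
        self.found = False
        self.disabled = False
        self.text = ""
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        if tag != "button" or self.found:
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if "product-add" in classes:
            self.found = True
            self._capturing = True
            self.disabled = "disabled" in attr_map

    def handle_data(self, data):
        if self._capturing:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "button":
            self._capturing = False


def stock_status(html):
    """Classify button HTML as in stock, out of stock, or not rendered yet."""
    parser = ProductButtonParser()
    parser.feed(html)
    if not parser.found:
        return "Button not found"
    if not parser.text.strip():
        # Empty label: this is the placeholder that requests.get() returns.
        return "Unknown (button not rendered yet)"
    return "Out of Stock" if parser.disabled else "In Stock"


def fetch_rendered_button_html(url, timeout=15):
    """Load the page in headless Chrome and return the rendered button HTML.

    Assumption: the button label is empty until the site's scripts finish.
    """
    # Imported here so the parsing code above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        selector = "button.product-add"
        # Wait until the button exists and has a non-empty label.
        WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.CSS_SELECTOR, selector).text.strip()
        )
        button = driver.find_element(By.CSS_SELECTOR, selector)
        return button.get_attribute("outerHTML")
    finally:
        driver.quit()
```

To check a product you would call stock_status(fetch_rendered_button_html(page_url)). Running stock_status on req.text from the question's code should give the "unknown" result, which is a quick way to confirm that the plain download really is the unrendered page.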