
How can I scrape a href that is hidden behind a placeholder?

I’m trying to scrape the href below from a site. The page contains several such hrefs, so I loop over it to collect them all in one list. Here is an example of one of them.

<div class="col-md-4 h-gutter">
   <div class="product box" data-productid="2111214"> 
      <a href="/products/examples/product1/"> 
         <h3>Product 1</h3> 
         <div class="product-small-text">

Here is the relevant section of my code. The commented-out line is my attempt to gather just the hrefs. As that is not working, for now I’m appending each entire “product box” div (the element inside “col-md-4 h-gutter”) instead.

link = []  # collects every matched element
for product in soup.select('div.product.box'):
    link.append(product)
    # link.append(product.a['href'])  # my attempt to get just the hrefs (not working)

print(link)

Below is what is printed to the terminal. As you can see, the hrefs are hidden behind a placeholder.

</div>, <div class="product placeholder-container box"> 
<h3><span class="placeholder-text--long"></span></h3> 
<div class="product-small-text"> 
<span class="placeholder-text--short"></span> 
</div>

How do I print out the value of href?


Answer

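It’s much easier to use the JSON response. The placeholder markup in your output suggests the products are filled in by JavaScript after the page loads, so the hrefs are not in the HTML that BeautifulSoup receives; the site’s API returns the same data directly as JSON. If you need it in table form, just feed that into pandas: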

import requests
import pandas as pd

url = 'https://www.masterofmalt.com/api/v2/lightningdeals/?isVatableCountry=1&deliveryCountryId=464&filter=nodrams&_=1617024330709&format=json'
headers={'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36'}
jsonData = requests.get(url, headers=headers).json()

df = pd.DataFrame(jsonData['lightningDeals'])

Output (first 5 of 43 rows):

print(df.head(5).to_string())
                                                              productUrl                                                                         productImageUrl  productRating  productReviewCount  productVolume  productAbv                categories                   endDateUtc  productId              productName  dealPrice  previousPrice  timeRemaining  saving  percentageClaimed  isActive  dailyDeal
0                      /whiskies/tobermory/tobermory-12-year-old-whisky/                      /whiskies/p-IMAGEPRESET/tobermory/tobermory-12-year-old-whisky.jpg            5.0                  17             70        46.3   [Whiskies, Single Malt]  2021-04-04T22:57:00.0000000      87989    Tobermory 12 Year Old      34.85          39.85         550379     5.0           0.669725      True      False
1  /whiskies/elements-of-islay/peat-pure-islay-elements-of-islay-whisky/  /whiskies/p-IMAGEPRESET/elements-of-islay/peat-pure-islay-elements-of-islay-whisky.jpg            0.0                   0             50        45.0  [Whiskies, Blended Malt]  2021-04-04T22:59:00.0000000      58061          Peat Pure Islay      23.94          28.94         550499     5.0           0.625000      True      False
2                                 /mezcal/ilegal/ilegal-reposado-mezcal/                                 /mezcal/p-IMAGEPRESET/ilegal/ilegal-reposado-mezcal.jpg            5.0                   3             70        40.0        [Mezcal, Reposado]  2021-04-04T22:59:00.0000000       9277          Ilegal Reposado      53.40          59.40         550499     6.0           0.500000      True      False
3                        /whiskies/nikka/nikka-coffey-grain-whisky-70cl/                        /whiskies/p-IMAGEPRESET/nikka/nikka-coffey-grain-whisky-70cl.jpg            4.5                  40             70        45.0         [Whiskies, Grain]  2021-04-04T22:57:00.0000000      32316  Nikka Coffey Grain 70cl      49.83          54.83         550379     5.0           0.410256      True      False
4                              /rum/satchmo/satchmo-mojito-spirited-rum/                              /rum/p-IMAGEPRESET/satchmo/satchmo-mojito-spirited-rum.jpg            5.0                  14             70        37.5             [Rum, Spiced]  2021-04-04T22:58:00.0000000     106576              Satchmo Rum      34.95          39.95         550439     5.0           0.338710      True      False
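To get just the hrefs the question asks for, the productUrl field can be joined onto the site root. This is a minimal sketch, not part of the original answer: the helper name absolute_product_urls and the BASE_URL constant are made up for illustration, the site root is assumed from the host of the API URL, and it assumes jsonData['lightningDeals'] is a list of dicts (which is what the DataFrame construction above implies). The sample records are copied from the output above so the snippet runs without a network call.

```python
from urllib.parse import urljoin

# Assumption: the site root is the host of the API URL used above.
BASE_URL = "https://www.masterofmalt.com"


def absolute_product_urls(deals, base_url=BASE_URL):
    """Return an absolute URL for every deal record that has a productUrl."""
    return [
        urljoin(base_url, deal["productUrl"])
        for deal in deals
        if deal.get("productUrl")  # skip records with a missing or empty URL
    ]


# Two records copied from the output above; in practice you would pass
# jsonData['lightningDeals'] here instead.
sample_deals = [
    {"productName": "Tobermory 12 Year Old",
     "productUrl": "/whiskies/tobermory/tobermory-12-year-old-whisky/"},
    {"productName": "Ilegal Reposado",
     "productUrl": "/mezcal/ilegal/ilegal-reposado-mezcal/"},
]

links = absolute_product_urls(sample_deals)
for link in links:
    print(link)
```

If you already have the DataFrame, df['productUrl'].tolist() gives the same relative paths, which you can pass through urljoin in the same way.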
User contributions licensed under: CC BY-SA