
Unable to scrape website pages with unchanged URL – Python

I'm trying to get the names of all the games on this website: "https://slotcatalog.com/en/The-Best-Slots#anchorFltrList". To do so, I'm using the following code:

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}

url = "https://slotcatalog.com/en/The-Best-Slots#anchorFltrList"

page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')

data = []
table = soup.find_all('div', attrs={'class':'providerCard'})

for card in table:
    print(card.find('a')['title'])

and I get what I want. I would like to do the same across all the pages available on the website, but since the URL does not change, I looked at the network (XHR) events fired when clicking on a different page and tried to send an equivalent request with the following code:

for page_no in range(1, 100):
    data = {
            "blck":"fltrGamesBlk",
            "ajax":"1",
            "lang":"end",
            "p":str(page_no),
            "translit":"The-Best-Slots",
            "tag":"TOP",
            "dt1":"",
            "dt2":"",
            "sorting":"SRANK",
            "cISO":"GB",
            "dt_period":"",
            "rtp_1":"50.00",
            "rtp_2":"100.00",
            "max_exp_1":"2.00",
            "max_exp_2":"250000.00",
            "min_bet_1":"0.01",
            "min_bet_2":"5.00",
            "max_bet_1":"3.00",
            "max_bet_2":"10000.00"
        }
    page = requests.post('https://slotcatalog.com/index.php',
                         data=data,
                         headers={'Host': 'slotcatalog.com',
                                  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:82.0) Gecko/20100101 Firefox/82.0'})


    soup = BeautifulSoup(page.content, 'html.parser')
    for row in soup.find_all('div', attrs={'class':'providerCard'}):
        name = row.find('a')['title']
        print(name)
        

Result: KeyError: 'title' – meaning it's not finding the class "providerCard". Has the request to the website been made in the wrong way? If so, where should I change the code? Thanks in advance.


Answer

Alright, so, you had a typo. XD It was "lang": "end" in the payload, which should have been "lang": "en", among other things.

Anyhow, I've cleaned your code up a bit and it works as expected. You can keep looping through the pages to collect all the games, if you want (see the sketch after the output below).

import requests
from bs4 import BeautifulSoup

headers = {
    "referer": "https://slotcatalog.com/en/The-Best-Slots",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/50.0.2661.102 Safari/537.36",
    "x-requested-with": "XMLHttpRequest",
}

payload = {
    "blck": "fltrGamesBlk",
    "ajax": "1",
    "lang": "en",
    "p": 1,
    "translit": "The-Best-Slots",
    "tag": "TOP",
    "dt1": "",
    "dt2": "",
    "sorting": "SRANK",
    "cISO": "EN",
    "dt_period": "",
    "rtp_1": "50.00",
    "rtp_2": "100.00",
    "max_exp_1": "2.00",
    "max_exp_2": "250000.00",
    "min_bet_1": "0.01",
    "min_bet_2": "5.00",
    "max_bet_1": "3.00",
    "max_bet_2": "10000.00"
}
page = requests.post(
    "https://slotcatalog.com/index.php",
    data=payload,
    headers=headers,
)
soup = BeautifulSoup(page.content, "html.parser")
print([i.get("title") for i in soup.find_all("a", {"class": "providerName"})])


Output (for page 1 only):

['Starburst', 'Bonanza', 'Rainbow Riches', 'Book of Dead', "Fishin' Frenzy", 'Wolf Gold', 'Twin Spin', 'Slingo Rainbow Riches', "Gonzo's Quest", "Gonzo's Quest Megaways", 'Eye of Horus (Reel Time Gaming)', 'Age of the Gods God of Storms', 'Lightning Roulette', 'Buffalo Blitz', "Fishin' Frenzy Megaways", 'Fluffy Favourites', 'Blue Wizard', 'Legacy of Dead', '9 Pots of Gold', 'Buffalo Blitz II', 'Cleopatra (IGT)', 'Quantum Roulette', 'Reel King Mega', 'Mega Moolah', '7s Deluxe', "Rainbow Riches Pick'n'Mix", "Shaman's Dream"]
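
If you do want to collect every page rather than just the first one, here is a minimal sketch of that loop. It is meant to run as a continuation of the snippet above (it reuses the headers and payload dicts) and it assumes, without having verified it against the site, that requesting a page past the last one simply returns no providerName links; that empty result is used as the stop condition. The max_pages cap and the one-second pause are arbitrary safety choices, not values taken from slotcatalog.com.

import time

all_games = []
max_pages = 100  # arbitrary upper bound so the loop cannot run forever

for page_no in range(1, max_pages + 1):
    payload["p"] = page_no  # only the page number changes between requests
    page = requests.post(
        "https://slotcatalog.com/index.php",
        data=payload,
        headers=headers,
    )
    soup = BeautifulSoup(page.content, "html.parser")
    titles = [a.get("title") for a in soup.find_all("a", {"class": "providerName"})]

    if not titles:  # assumption: an empty result means we went past the last page
        break

    all_games.extend(titles)
    time.sleep(1)  # small pause between requests to be polite to the server

print(len(all_games))
print(all_games[:10])

The same idea works with the data dict from the question; the only thing that needs to change between requests is the "p" value.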