
Web scraping returns empty values: possibly a protected site

I’m scraping www.albumoftheyear.org, but my code only ever produces an empty DataFrame.

I don’t know whether the site is protected by something like Cloudflare and that is the cause, or whether I’m selecting the wrong tags.
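One way to tell the two causes apart is to look at the raw response before parsing it. The sketch below is a heuristic only: the function name `looks_like_block_page`, the status codes, and the marker phrases are assumptions about what anti-bot challenge pages commonly look like, not anything confirmed for this site.

```python
def looks_like_block_page(status_code, headers, body):
    """Heuristically guess whether a response is an anti-bot challenge page."""
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {k.lower(): str(v).lower() for k, v in headers.items()}
    served_by_cloudflare = (
        "cloudflare" in lowered.get("server", "") or "cf-ray" in lowered
    )
    # Assumption: blocked or rate-limited requests usually come back as 403, 429 or 503.
    suspicious_status = status_code in (403, 429, 503)
    # Assumption: phrases often seen on challenge pages; these may change over time.
    markers = ("just a moment", "attention required", "checking your browser")
    has_marker = any(marker in body.lower() for marker in markers)
    return suspicious_status or (served_by_cloudflare and has_marker)
```

With `requests`, this could be called as `looks_like_block_page(r.status_code, r.headers, r.text)`. If it returns `False` and the DataFrame is still empty, the selectors are the more likely culprit.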

The basic idea is to iterate through the pages, collect the data (title, year, genre) for each album, and build a pandas DataFrame from it.
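The idea above can be sketched with only the standard library, which also makes it easy to check selectors against a saved copy of a page. The class names `albumListRow`, `albumListTitle`, `albumListDate` and `albumListGenre` are assumptions about the markup (only `albumListRank` is mentioned in this thread), and `parse_albums` and `collect_rows` are hypothetical helper names.

```python
from html.parser import HTMLParser

# Assumed class names; adjust them to whatever the real page uses.
ROW_CLASS = "albumListRow"
FIELDS = {"albumListTitle": "title", "albumListDate": "date", "albumListGenre": "genre"}
# Void elements never get an end tag, so they must not affect the nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class AlbumListParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.rows = []       # one dict per album row
        self._field = None   # field whose text is currently being captured
        self._depth = 0      # nesting depth inside that field's element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._field is not None:
            self._depth += 1  # nested tag (e.g. a link) inside a field
        elif ROW_CLASS in classes:
            self.rows.append({name: "" for name in FIELDS.values()})
        elif self.rows:
            for cls in classes:
                if cls in FIELDS:
                    self._field, self._depth = FIELDS[cls], 1
                    break

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._field = None  # left the field's element

    def handle_data(self, data):
        if self._field is not None:
            self.rows[-1][self._field] += data


def parse_albums(html):
    """Return a list of {'title', 'date', 'genre'} dicts found in one page."""
    parser = AlbumListParser()
    parser.feed(html)
    parser.close()
    return [{key: value.strip() for key, value in row.items()} for row in parser.rows]


def collect_rows(pages_html):
    """Concatenate the rows parsed from an iterable of page HTML strings."""
    rows = []
    for html in pages_html:
        rows.extend(parse_albums(html))
    return rows
```

The resulting list of dicts can be passed straight to `pd.DataFrame(rows)`. An empty list from a page that visibly contains albums suggests the class names do not match the markup.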

Here is the code developed:

JavaScript


Answer

Here is a working solution using Selenium. Note that you need the WebDriver for your browser installed on your system. I am using Chrome, and chromedriver can be downloaded from here. Yes, you need both the browser and the driver.
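For orientation, a minimal sketch of fetching a rendered page with Selenium and Chrome follows. It is not the answer's original code; `fetch_page_source` is a hypothetical helper name, and the headless flag is an assumption that may need to be turned off if the site treats headless browsers differently. Recent Selenium releases (4.6 and later) can usually locate a matching driver on their own, while older ones need chromedriver on the `PATH`.

```python
def fetch_page_source(url, headless=True):
    """Load `url` in Chrome via Selenium and return the rendered HTML."""
    # Imported lazily so this module can be imported without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    if headless:
        # Assumption: a Chrome version that supports the newer headless mode.
        options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

The returned HTML string can then be handed to whatever parser builds the DataFrame.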

JavaScript

Output:

JavaScript

If you do not want the albumListRank column, change this line from

JavaScript

to

JavaScript
User contributions licensed under: CC BY-SA