
Scraping all entries of a lazy-loading page using Python

See this page with ECB press releases. These go back to 1997, so it would be nice to automate getting all the links going back in time.

I found the element that holds the links ('//*[@id="lazyload-container"]'), but selecting it only returns the most recent links.

How can I get the rest?

JavaScript


Answer

The data is loaded via JavaScript from another URL, so it is not in the HTML of the initial page. You can use this example to load the releases from different years:

JavaScript

Prints:

JavaScript

EDIT: To print links:

JavaScript

Prints:

JavaScript
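
For illustration, here is a minimal sketch of the idea described in this answer: request the per-year fragment that the page's JavaScript would load, then collect the links from it. The URL pattern in `YEAR_URL` is an assumption, not something confirmed here; check the real request in your browser's developer tools (Network tab) while scrolling the page. The names `LinkCollector`, `extract_links` and `fetch_year` are hypothetical helpers, and only the standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# ASSUMPTION: per-year URL pattern; verify it in the browser's Network tab.
YEAR_URL = "https://www.ecb.europa.eu/press/pr/date/{year}/html/index_include.en.html"


class LinkCollector(HTMLParser):
    """Collect (text, absolute_url) pairs for every <a href=...> in a fragment."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the URL the fragment came from.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html, base_url):
    """Return all (text, absolute_url) pairs found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_year(year):
    """Download one year's fragment and return its links (needs network access)."""
    url = YEAR_URL.format(year=year)
    with urlopen(url, timeout=30) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return extract_links(html, url)
```

To walk back through the archive, loop over the years, for example `for year in range(1997, 2023): print(fetch_year(year))`. This collects every anchor in the fragment, so you may want to filter by path or by the enclosing element once you have inspected the real markup.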
User contributions licensed under: CC BY-SA