
Connection timeouts as protection against site scraping?

I am new to Python and web scraping, but for the past two weeks I have been periodically scraping one website and successfully downloading images from it. I use different proxies and change them from time to time. Starting yesterday, however, all my proxies suddenly stopped working with a timeout error. I have tried a whole list of them and every one fails. Could this be some kind of site protection against scraping? If so, is there a way to overcome it?

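For context, this is roughly the kind of proxied image download the question describes; the URL, proxy address, and output file name below are placeholders, not values from the original post.

# Sketch of a proxied image download with requests; the URL and proxy
# address are placeholders, not values from the original post.
import requests

url = "https://example.com/images/photo.jpg"    # placeholder target
proxies = {
    "http": "http://203.0.113.10:8080",         # placeholder proxy
    "https": "http://203.0.113.10:8080",
}

response = requests.get(url, proxies=proxies, timeout=10)
response.raise_for_status()

with open("photo.jpg", "wb") as f:
    f.write(response.content)                   # save the image bytes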

Error message: the request fails with a connection timeout (urllib3's ConnectTimeoutError).


Answer

The fix is to GET the URL and retry up to 3 times when a ConnectTimeoutError occurs. It also helps to apply a delay between attempts, so you do not immediately fail again if the site enforces a periodic request quota.
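A minimal sketch of that pattern with requests is shown below; the URL, proxy, and delay values are illustrative assumptions, not the exact snippet from the original answer.

# Retry a GET up to 3 times on a connection timeout, sleeping between
# attempts; URL and proxy values are placeholders.
import time
import requests
from requests.exceptions import ConnectTimeout

url = "https://example.com/images/photo.jpg"     # placeholder
proxies = {"https": "http://203.0.113.10:8080"}  # placeholder proxy

response = None
for attempt in range(3):
    try:
        response = requests.get(url, proxies=proxies, timeout=10)
        break                                    # success, stop retrying
    except ConnectTimeout:
        time.sleep(5 * (attempt + 1))            # wait longer before each retry

if response is None:
    raise RuntimeError("all retry attempts timed out")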

Take a look at urllib3.util.retry.Retry; it has many options that simplify retries.

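One way to wire Retry into a requests session is through an HTTPAdapter, so the retry and back-off logic is handled by the session itself; again, the URL and proxy values are placeholders.

# Let urllib3's Retry handle retries and back-off automatically via
# requests' HTTPAdapter; values below are illustrative.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retry = Retry(
    total=3,            # retry up to 3 times overall
    connect=3,          # including connection timeouts
    backoff_factor=2,   # exponential back-off between retries
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
session.mount("http://", HTTPAdapter(max_retries=retry))

response = session.get(
    "https://example.com/images/photo.jpg",          # placeholder URL
    proxies={"https": "http://203.0.113.10:8080"},   # placeholder proxy
    timeout=10,
)

With total=3 and a backoff_factor set, urllib3 spaces the retries out automatically, which covers the delay-between-attempts advice above.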