‘406 Not Acceptable’ after scraping a website using Python

The website I scraped blocked me, and my browser now shows 406 Not Acceptable. I may have accidentally sent too many requests at once from my Python code.

So I added time.sleep(10) to each loop iteration so that it would not look like a DDoS attack, and that seems to have worked.

My questions are:

  1. How long is it reasonable to wait between requests? Sleeping 10 seconds on each loop iteration makes my code run too slowly.

  2. How can I fix the 406 Not Acceptable error in my browsers? They are still blocked unless I change my IP address, but that is not a permanent solution.

Thank you all for your answers and comments. Good day!

Answer

Rate limits depend entirely on the website you are scraping or interacting with. I could set up a website that lets you view it only once per day before it starts throwing HTTP errors at you. So for your first question, there is no definitive answer: you have to test for yourself and find the fastest rate you can use without getting blocked.
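
One way to do that testing without hammering the site is to start with a short delay and let the script slow itself down whenever the server pushes back. Below is a minimal sketch of that idea using only the standard library. The names crawl, fetch_status and BLOCK_STATUSES are made up for this example, and the set of status codes and the delay values are assumptions; adjust them to whatever the site you are scraping actually returns.

```python
import random
import time
import urllib.error
import urllib.request

# Assumption: the statuses this particular site uses to say "slow down".
BLOCK_STATUSES = {406, 429, 503}


def fetch_status(url):
    """Return (status, body) for a URL using only the standard library."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; turn that into a plain status code.
        return err.code, b""


def crawl(urls, fetch=fetch_status, base_delay=1.0, max_delay=60.0,
          max_retries=5, sleep=time.sleep):
    """Fetch each URL in turn, doubling the pause whenever the site pushes back."""
    delay = base_delay
    results = {}
    for url in urls:
        for _ in range(max_retries):
            status, body = fetch(url)
            if status not in BLOCK_STATUSES:
                results[url] = (status, body)
                # Success: ease back toward the base delay.
                delay = max(base_delay, delay / 2)
                break
            # Blocked: back off exponentially, capped at max_delay.
            delay = min(max_delay, delay * 2)
            sleep(delay)
        else:
            # Every retry was blocked; record the last status and move on.
            results[url] = (status, b"")
        # Pause between URLs, with a little jitter so requests are not
        # perfectly periodic.
        sleep(delay + random.uniform(0, delay * 0.1))
    return results
```

Calling crawl(list_of_urls) starts at one request per second and only slows down when the site returns one of the blocking statuses, so you are not stuck with a fixed 10 second sleep on every iteration. If the site publishes a crawl delay or sends a Retry-After header, it is better to follow that instead of guessing.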

However, there is a workaround. If you spread your requests across proxies, it becomes much harder for the site to detect and stop them, so you are far less likely to be hit by HTTP errors. HOWEVER, JUST BECAUSE YOU CAN DOESN’T MEAN THAT YOU SHOULD. I am a programmer, not a lawyer, but I’m sure there’s a rule somewhere that says spamming a page, even after it tells you to stop, is illegal.

Your second question isn’t exactly related to programming, but I will answer it anyway: try clearing your cookies or getting a new IP address (for example by using a VPN). Your IP address and cookies are the main things a page uses to identify you in order to block you, so changing them is usually what lifts the block.

User contributions licensed under: CC BY-SA