
How to troubleshoot Scrapy shell response 403 error

A few months ago I followed this Scrapy shell method to scrape a real estate listings webpage and it worked perfectly.

I pulled my cookie and user-agent strings from Firefox (Developer Tools -> Headers) while the target URL was loaded. With those in the request I got a successful response (200) and was able to pull items with response.xpath.
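A minimal sketch of that kind of setup, assuming the cookie was copied as a single raw `Cookie:` header string: it splits the string into a dict and passes it, along with the user-agent, to `scrapy.Request`. The helper names (`parse_cookie_header`, `build_request`), the URL, and all header values are hypothetical placeholders, not the values used against the real site.

```python
def parse_cookie_header(raw_cookie: str) -> dict:
    """Split a raw 'Cookie:' header value (as copied from the browser) into a dict."""
    cookies = {}
    for pair in raw_cookie.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


def build_request(url: str, user_agent: str, raw_cookie: str):
    """Build a scrapy.Request carrying the browser's user-agent and cookies."""
    from scrapy import Request  # imported lazily so the parser works without Scrapy

    return Request(
        url,
        headers={"User-Agent": user_agent},
        cookies=parse_cookie_header(raw_cookie),
    )


# Inside `scrapy shell`, after defining the helpers above (placeholder values):
#   req = build_request("https://example.com/listings",
#                       "Mozilla/5.0 ...",
#                       "sessionid=abc123; theme=dark")
#   fetch(req)          # shell shortcut; refreshes the `response` object
#   response.status     # status code of the fetched page
#   response.xpath("//title/text()").get()
```

Passing cookies as a dict rather than inside the headers lets Scrapy's cookie handling manage them, which is usually easier to update when the browser cookie changes.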


Now I'm trying again a few months later (with an updated cookie) and I'm getting a 403 error, meaning the server understands the request but refuses to authorize it.

Any thoughts on what I might try to get this working again? Thanks.


Answer

The cookie is not what's causing the problem. I think the issue is that with 'view=map', the server is looking for a 'referer' key in the request headers (in addition to the other header keys). I would suggest adding a 'referer': "url" key/value pair to your headers. Alternatively, you can try a less heavy approach.
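To illustrate the suggestion, here is a minimal sketch that adds a Referer entry to the header dict and, as one possible reading of a "less heavy" approach, builds the request with Python's standard-library `urllib` instead of the Scrapy shell. This is a sketch under assumptions, not the answerer's own code: the function names, the URLs, and the user-agent string are hypothetical placeholders.

```python
import urllib.error
import urllib.request


def with_referer(headers: dict, referer: str) -> dict:
    """Return a copy of `headers` with a Referer entry added."""
    merged = dict(headers)
    merged["Referer"] = referer
    return merged


def build_plain_request(url: str, headers: dict) -> urllib.request.Request:
    """Build (but do not send) a standard-library request with the given headers."""
    return urllib.request.Request(url, headers=headers)


def fetch_status(req: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code, including error codes such as 403."""
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


# Placeholder values; substitute the real listing URL and your browser's user-agent.
base_headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"}
headers = with_referer(base_headers, "https://example.com/listings")
req = build_plain_request("https://example.com/listings?view=map", headers)
# fetch_status(req) would perform the network call; it is left uncalled here.
```

The same `headers` dict can also be passed as `headers=` to `scrapy.Request` if you prefer to stay in the Scrapy shell. Comparing the status with and without the Referer entry is a quick way to check whether that header is what the server is rejecting.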

User contributions licensed under: CC BY-SA