
Trying to add multiple yields into a single JSON file using Scrapy

I am trying to figure out whether my Scrapy spider is correctly requesting product_link in the callback request: 'yield scrapy.Request(product_link, callback=self.parse_new_item)'. product_link should be 'https://www.antaira.com/products/10-100Mbps/LNX-500A', but I have not been able to confirm that my program reaches that next callback, so I cannot tell whether it yields the correct item. Thank you!


Answer

You have a few issues:

  1. Scrapy items are essentially dictionaries and are therefore mutable. You need to create a new item for every yield statement; otherwise each yield hands out the same object and later assignments overwrite earlier values.

  2. Your second parse callback references a variable, items, that it does not have access to, because it was defined in your first parse callback.

  3. In your urljoin call you are passing the string literal 'rel_product_link' instead of the variable rel_product_link.
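
To see why points 1 and 3 matter, here is a small sketch in plain Python, with ordinary dicts standing in for Scrapy items and urllib.parse.urljoin standing in for response.urljoin. The second product name, the base URL, and the relative path are assumptions made up for illustration; they are not taken from the original spider.

```python
from urllib.parse import urljoin


def shared_item(names):
    item = {}  # one dict reused for every yield
    for name in names:
        item["name"] = name
        yield item  # every yield hands out the same object


def fresh_items(names):
    for name in names:
        yield {"name": name}  # a new dict per yield


# "LNX-501A" is a hypothetical second product, used only for the demo.
names = ["LNX-500A", "LNX-501A"]
print(list(shared_item(names)))  # both entries show the last name written
print(list(fresh_items(names)))  # each entry keeps its own name

# Assumed listing URL and relative link, for illustration only.
base = "https://www.antaira.com/products/10-100Mbps"
rel_product_link = "/products/10-100Mbps/LNX-500A"

print(urljoin(base, "rel_product_link"))  # string literal: joins the literal text
print(urljoin(base, rel_product_link))    # variable: joins the real relative path
```

With the literal, the joined URL ends in the text rel_product_link rather than in the product path, so the request would go to a page that does not exist.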

In the example below I fixed those issues and made some additional notes.

User contributions licensed under: CC BY-SA