I am adding this portion to my code and I have no clue what's going on. This is the link I am using: https://www.digikey.com/products/en?keywords=ID82C55

All in the same process:
- My CSS selector returns None.
- Then it finds a couple of the HTML elements and returns some of them.
- Then it finds the last element.

This causes my program to mix and match data and yield it incorrectly to my CSV file. Could anyone tell me what the problem is here? Thanks.
Code
def parse(self, response):
    for b in response.css('div#pdp_content.product-details > div'):

        if b.css('div.product-details-headline h1::text').get():
            part = b.css('div.product-details-headline h1::text').get()
            part = part.strip()
            parts1 = part
            print(b.css('div.product-details-headline h1::text').get())
            print(parts1)
        else:
            print(b.css('div.product-details-headline h1::text').get())

        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get():
            cleaned_quantity = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get()
            print(cleaned_quantity)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get())

        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get():
            cleaned_price = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get()
            print(cleaned_price)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get())

        if b.css('div.quantity-message span#dkQty::text').get():
            cleaned_stock = b.css('div.quantity-message span#dkQty::text').get()
            print(cleaned_stock)
        else:
            print(b.css('div.quantity-message span#dkQty::text').get())

        if b.css('table#product-attribute-table > tr:nth-child(7) td::text').get():
            status = b.css('table#product-attribute-table > tr:nth-child(7) td::text').get()
            status = status.strip()
            cleaned_status = status
            print(cleaned_status)
        else:
            print(b.css('table#product-attribute-table > tr:nth-child(7) td::text').get())

        # yield {
        #     'Part': parts1,
        #     'Quantity': cleaned_quantity,
        #     'Price': cleaned_price,
        #     'Stock': cleaned_stock,
        #     'Status': cleaned_status,
        # }
Output
None
None
None
None
None
None
2,500
29.10828
29
None
ID82C55A
ID82C55A
None
None
None
Active
Answer
I highly recommend switching to XPath expressions:
part_number = b.xpath('.//th[.="Manufacturer Part Number"]/following-sibling::td[1]/text()').get()
stock = b.xpath('.//span[.="In Stock"]/preceding-sibling::span[1]/text()').get()
etc.
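Changing the selectors alone may not stop the rows from mixing. The output in the question suggests that each child div matched by the loop contains only some of the fields, so a yield inside the loop would emit an incomplete record on every pass. Below is a minimal sketch, independent of Scrapy, of collecting the fields across all blocks and building one record at the end. The name `merge_blocks`, the `FIELDS` tuple, and the sample dicts are hypothetical; the values are copied from the question's output, and the whitespace around the part number is an assumption.

```python
# Hypothetical field names, matching the keys of the commented-out yield.
FIELDS = ("Part", "Quantity", "Price", "Stock", "Status")


def merge_blocks(blocks):
    """Combine per-block partial results into a single record.

    Each element of `blocks` maps a field name to the extracted text,
    mimicking what .get() found inside one child <div>. Missing keys
    stand for selectors that returned None in that block.
    """
    item = dict.fromkeys(FIELDS)  # every field starts as None
    for block in blocks:
        for field in FIELDS:
            value = block.get(field)
            # Keep the first non-empty value seen for each field.
            if value and item[field] is None:
                item[field] = value.strip()
    return item


# Values from the question's output, grouped by loop iteration.
blocks = [
    {},  # first child div: nothing matched
    {"Quantity": "2,500", "Price": "29.10828", "Stock": "29"},
    {"Part": " ID82C55A ", "Status": "Active"},  # padding is assumed
]

print(merge_blocks(blocks))
```

In the spider itself, a similar effect can probably be had by running each selector once against `response` (or the `div#pdp_content` container) instead of against every child div, and yielding a single item after all fields are collected.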