I am adding this portion to my code and I have no idea what is going on. This is the page I am scraping: https://www.digikey.com/products/en?keywords=ID82C55

All within the same run:
- My CSS selectors first return None.
- Then they find some of the HTML elements and return a few of the values.
- Then they find the last element.

This causes my program to mix up the data and yield it incorrectly to my CSV file. Can anyone tell me what the problem is here? Thanks.
Code
    def parse(self, response):

        for b in response.css('div#pdp_content.product-details > div'):

            if b.css('div.product-details-headline h1::text').get():
                part = b.css('div.product-details-headline h1::text').get()
                part = part.strip()
                parts1 = part
                print(b.css('div.product-details-headline h1::text').get())
                print(parts1)

            else:
                print(b.css('div.product-details-headline h1::text').get())

            if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get():
                cleaned_quantity = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get()
                print(cleaned_quantity)
            else:
                print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get())
            if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get():
                cleaned_price = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get()
                print(cleaned_price)

            else:
                print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get())
            if b.css('div.quantity-message span#dkQty::text').get():
                cleaned_stock = b.css('div.quantity-message span#dkQty::text').get()
                print(cleaned_stock)

            else:
                print(b.css('div.quantity-message span#dkQty::text').get())

            if b.css('table#product-attribute-table > tr:nth-child(7) td::text').get():
                status = b.css('table#product-attribute-table > tr:nth-child(7) td::text').get()
                status = status.strip()
                cleaned_status = status
                print(cleaned_status)

            else:
                print(b.css('table#product-attribute-table > tr:nth-child(7) td::text').get())

            # yield {
            #     'Part': parts1,
            #     'Quantity': cleaned_quantity,
            #     'Price': cleaned_price,
            #     'Stock': cleaned_stock,
            #     'Status': cleaned_status,
            # }
Output
    None
    None
    None
    None
    None
    None
    2,500
    29.10828
    29
    None

    ID82C55A

    ID82C55A
    None
    None
    None
    Active
Answer
I highly recommend switching to XPath expressions that locate each value by its label instead of by its position on the page:
    part_number = b.xpath('.//th[.="Manufacturer Part Number"]/following-sibling::td[1]/text()').get()
    stock = b.xpath('.//span[.="In Stock"]/preceding-sibling::span[1]/text()').get()
    # etc.