
How to iterate a variable in XPATH, extract a link and store it into a list for further iteration

I’m following a Selenium tutorial for an Amazon price tracker (Clever Programming on YouTube), and I got stuck getting the links from Amazon using their techniques.

Tutorial link: https://www.youtube.com/watch?v=WbJeL_Av2-Q&t=4315s

I realized the problem is that I’m only getting one link out of the 17 available after doing the product search. I need to get the links for every product after the search and then use them to open each product page and get its title, seller and price.

The function get_products_links() should get all the links and store them in a list to be used by the function get_product_info().

JavaScript

At this point get_products_links() returns only one link, because I hard-coded ‘i’ to 3 to make it work for now.

I was thinking of iterating over ‘i’ so I can collect every distinct XPath, but I don’t know how to implement this.

I’ve tried running a for loop and appending each result to a new list, but then the app stops working.

Here is the complete code:

JavaScript

Steps to Run the script:

Step 1: Install Selenium==3.141.0 into your virtual environment.

Step 2: Search for ChromeDriver on Google and download the driver that matches your Chrome version. After the download, extract the driver and paste it into your working folder.

Step 3: Create a file called amazon_config.py and insert the following code:

JavaScript

If you performed the steps correctly you should be able to run the script and it will perform the following:

  1. Go to www.amazon.com
  2. Search for a product (In this case “PS4”)
  3. Get a link for the first product
  4. Visit that product link

Terminal should print:

JavaScript

What I’m not able to do is get all the links and iterate over them so that the script visits every link on the first page.

If you are able to get all links, the terminal should print:

JavaScript


Answer

I can’t run it, so I can only guess how I would do it.

I would put the whole try/except inside a for loop, use links.append() instead of links = [...], and return only after the loop has finished.

JavaScript
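A minimal sketch of that loop, assuming the Selenium 3 call find_element_by_xpath() and a hypothetical xpath_template argument that stands in for the question's XPath with its index written as {i}. The default of 17 items comes from the question; nothing here has been run against Amazon.

```python
def get_products_links(driver, xpath_template, max_items=17):
    """Collect one href per result index instead of a single fixed index.

    xpath_template is a hypothetical placeholder for the question's XPath,
    with the changing index written as {i}, e.g. '...div[{i}]...'.
    """
    links = []
    for i in range(1, max_items + 1):
        try:
            element = driver.find_element_by_xpath(xpath_template.format(i=i))
            # append keeps earlier results; "links = [...]" would overwrite them
            links.append(element.get_attribute("href"))
        except Exception as exc:
            # Selenium raises NoSuchElementException when an index has no match;
            # report it and continue with the next index
            print(f"No link at index {i}: {exc}")
    # return only after the loop, otherwise the first iteration ends the function
    return links
```

Keeping the try/except inside the loop means one missing index does not discard the links already collected.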

But I would also try an XPath that starts with // to skip most of the divs. If that lets me drop div[{i}] entirely, a single query could return all the products without a for loop.
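A rough sketch of that idea, assuming the Selenium 3 call find_elements_by_xpath() (plural). The default XPath "//h2/a" is only a guess at where the result links sit and would need to be checked against the real page.

```python
def get_all_products_links(driver, xpath="//h2/a"):
    """Fetch every matching link with one query instead of indexing div[{i}].

    The default xpath is an assumption, not taken from the question's code.
    """
    # find_elements_* returns a (possibly empty) list rather than raising
    elements = driver.find_elements_by_xpath(xpath)
    hrefs = (element.get_attribute("href") for element in elements)
    # drop anchors that have no href
    return [href for href in hrefs if href]
```

Because the plural call returns an empty list when nothing matches, this version needs no try/except around the lookup.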


BTW:

In get_products_info() I see a similar problem: you create an empty list with product = [], but later in the loop you assign product = ..., which replaces the previous value. It would need product.append() to keep all the values.

Something like

JavaScript
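A minimal sketch of the append pattern, written as a standalone function. The get_single_product_info argument is a hypothetical callable that returns the details for one link (or None on failure); it stands in for whatever the full script uses.

```python
def get_products_info(links, get_single_product_info):
    """Build a list with one entry per product link.

    get_single_product_info is a hypothetical callable: link -> dict or None.
    """
    products = []
    for link in links:
        product = get_single_product_info(link)
        if product:
            # append adds to the list; "products = product" would replace it
            products.append(product)
    return products
```

Using a separate name for the single item (product) and for the list (products) makes it harder to overwrite the list by accident.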
User contributions licensed under: CC BY-SA