
Scrape eBay Sold Items Using Selenium Returns []

I have almost no web scraping experience and couldn’t solve this with BeautifulSoup, so I’m trying Selenium (I installed it today). I’m trying to scrape sold items on eBay from this URL:

https://www.ebay.com/sch/i.html?_from=R40&_nkw=oakley+sunglasses&_sacat=0&Brand=Oakley&rt=nc&LH_Sold=1&LH_Complete=1&_ipg=200&_oaa=1&_fsrp=1&_dcat=79720

Here is my code, which fetches the page with requests and then opens the same URL in Selenium:

    ebay_url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=oakley+sunglasses&_sacat=0&Brand=Oakley&rt=nc&LH_Sold=1&LH_Complete=1&_ipg=200&_oaa=1&_fsrp=1&_dcat=79720'

    html = requests.get(ebay_url)
    #print(html.text)

    driver = wd.Chrome(executable_path=r'/Users/mburley/Downloads/chromedriver')
    driver.get(ebay_url)

This correctly opens a new Chrome session at the right URL. I’m working on getting the titles, prices, and sold dates and then writing them to a CSV file. Here is the code I have for that:

    # Find all div tags and set equal to main_data
    all_items = driver.find_elements_by_class_name("s-item__info clearfix")[1:]
    #print(main_data)

    # Loop over main_data to extract div classes for title, price, and date
    for item in all_items:
        date = item.find_element_by_xpath("//span[contains(@class, 'POSITIVE']").text.strip()
        title = item.find_element_by_xpath("//h3[contains(@class, 's-item__title s-item__title--has-tags']").text.strip()
        price = item.find_element_by_xpath("//span[contains(@class, 's-item__price']").text.strip()

        print('title:', title)
        print('price:', price)
        print('date:', date)
        print('---')
        data.append( [title, price, date] )

This just returns []. I think eBay may be blocking my IP, but the HTML loads and looks correct. Hopefully someone can help. Thanks!


Answer

You can use the code below to scrape the sold date, title, and price of each listing. You can also use pandas to store the data in a CSV file.
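Before that listing, it may help to see why the question's loop likely prints nothing. This is a sketch, not a confirmed diagnosis. A class-name locator expects a single class, so the compound value "s-item__info clearfix" probably matches no elements and the loop body never runs. Had it run, each XPath would fail because the closing parenthesis of contains(...) is missing. A leading // also searches the whole page instead of the current item. The helper names `first_text` and `scrape_items` below are hypothetical, and the class names are taken from the question on the assumption that they still match eBay's markup.

```python
try:
    from selenium.webdriver.common.by import By
    CSS, XPATH = By.CSS_SELECTOR, By.XPATH
except ImportError:
    # Fallback so the sketch can run without Selenium installed;
    # these are the string values of By.CSS_SELECTOR and By.XPATH.
    CSS, XPATH = "css selector", "xpath"

# A leading "." makes each XPath relative to the current item.
# Class names come from the question and are assumed to match the page.
FIELDS = {
    "title": ".//h3[contains(@class, 's-item__title')]",
    "price": ".//span[contains(@class, 's-item__price')]",
    "date": ".//span[contains(@class, 'POSITIVE')]",
}


def first_text(item, xpath):
    """Return the stripped text of the first match inside item, or ''."""
    matches = item.find_elements(XPATH, xpath)
    return matches[0].text.strip() if matches else ""


def scrape_items(driver):
    """Collect [title, price, date] rows from an already loaded results page."""
    rows = []
    # Two classes on one element need a CSS selector, not a class-name locator.
    for item in driver.find_elements(CSS, ".s-item__info.clearfix"):
        rows.append([first_text(item, xpath) for xpath in FIELDS.values()])
    return rows
```

After `driver.get(ebay_url)`, calling `scrape_items(driver)` would return one list per listing. Using `find_elements` rather than `find_element` means a listing that lacks a field yields an empty string instead of raising an exception. The question skips the first match with `[1:]`; the same slice can be applied to the returned rows if the first one turns out to be a placeholder.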

Code :

    import pandas as pd
    from selenium import webdriver as wd
    from selenium.webdriver.common.by import By

    ebay_url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=oakley+sunglasses&_sacat=0&Brand=Oakley&rt=nc&LH_Sold=1&LH_Complete=1&_ipg=200&_oaa=1&_fsrp=1&_dcat=79720'

    # executable_path matches the Selenium version used in the question;
    # newer Selenium releases take a Service object instead.
    driver = wd.Chrome(executable_path=r'/Users/mburley/Downloads/chromedriver')
    driver.maximize_window()
    driver.implicitly_wait(30)  # wait up to 30 seconds for elements to appear
    driver.get(ebay_url)

    sold_date = []
    title = []
    price = []

    # The span with class POSITIVE inside each tag block is read as the sold date.
    base = "//div[contains(@class,'title--tagblock')]/span[@class='POSITIVE']"
    tag_div = base + "/ancestor::div[contains(@class,'tag')]"

    # XPath positions are 1-based, so count from 1.
    for i, item in enumerate(driver.find_elements(By.XPATH, base), start=1):
        sold_date.append(item.text)
        title.append(driver.find_element(By.XPATH, f"({tag_div}/following-sibling::a/h3)[{i}]").text)
        price.append(driver.find_element(By.XPATH, f"({tag_div}/following-sibling::div[contains(@class,'details')]/descendant::span[@class='POSITIVE'])[{i}]").text)

    driver.quit()

    print(sold_date)
    print(title)
    print(price)

    data = {
        'Sold_date': sold_date,
        'title': title,
        'price': price
    }
    df = pd.DataFrame.from_dict(data)
    df.to_csv('out.csv', index=False)  # index=False omits the row-number column
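If pandas is not needed for anything else, the same file can be written with the standard library's csv module. This is a minimal sketch, assuming each row is a [sold_date, title, price] sequence; the `write_rows` helper and the `HEADER` constant are hypothetical names, and the column labels mirror the ones used in the answer.

```python
import csv

HEADER = ["Sold_date", "title", "price"]


def write_rows(rows, path="out.csv"):
    """Write a header line followed by one CSV line per scraped row."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADER)
        writer.writerows(rows)
```

With the three lists from the answer, `write_rows(zip(sold_date, title, price))` would produce the file. Note that `zip` stops at the shortest list, so a listing with a missing field shortens the output silently, whereas a pandas DataFrame built from lists of unequal length raises an error.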
User contributions licensed under: CC BY-SA