I am trying to collect the product names from the first two pages of an Amazon seller's listings. When I request the page, the response contains all the elements I need; however, when I parse it with BeautifulSoup, they are not listed. Here is my code:
    import requests
    from bs4 import BeautifulSoup

    headers = {'User-Agent': 'Mozilla/5.0'}
    res = requests.get("https://www.amazon.com/s?me=A3WE363L17WQR&marketplaceID=ATVPDKIKX0DER", headers=headers)
    #print(res.text)
    soup = BeautifulSoup(res.text, "html.parser")
    soup.find_all("a", href=True)
The product links are not listed. If the Amazon API provides this information, I am open to using it (please provide some examples of its usage). Thanks a lot in advance.
Answer
I extracted the product names from the alt attribute of the elements nested inside the .a-link-normal links. Is this what you intended?
    import requests
    from bs4 import BeautifulSoup as bs

    r = requests.get('https://www.amazon.com/s?me=A3WE363L17WQR&marketplaceID=ATVPDKIKX0DER')
    soup = bs(r.content, 'lxml')
    items = [item['alt'] for item in soup.select('.a-link-normal [alt]')]
    print(items)
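To show what the `.a-link-normal [alt]` selector matches without hitting Amazon, here is a minimal offline sketch. The HTML snippet, the product names in it, and the helper name `extract_alt_texts` are made up for illustration; the real page markup may differ and can change at any time. It uses the built-in `html.parser` rather than `lxml` only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on a search-results page.
SAMPLE_HTML = """
<div>
  <a class="a-link-normal" href="/dp/EXAMPLE1"><img src="a.jpg" alt="Example product one"></a>
  <a class="a-link-normal" href="/dp/EXAMPLE2"><img src="b.jpg" alt="Example product two"></a>
  <a class="other" href="/x"><img src="c.jpg" alt="Not selected"></a>
</div>
"""


def extract_alt_texts(html):
    """Return the alt text of every element nested inside an .a-link-normal element."""
    soup = BeautifulSoup(html, "html.parser")
    # '.a-link-normal [alt]' is a descendant selector: any element that has
    # an alt attribute and sits somewhere inside a class="a-link-normal" element.
    return [tag["alt"] for tag in soup.select(".a-link-normal [alt]")]


print(extract_alt_texts(SAMPLE_HTML))
```

The third link is skipped because its class is not `a-link-normal`, which is why the selector filters out images that do not belong to product links.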
Over two pages:
    import requests
    from bs4 import BeautifulSoup as bs

    url = 'https://www.amazon.com/s?i=merchant-items&me=A3WE363L17WQR&page={}&marketplaceID=ATVPDKIKX0DER&qid=1553116056&ref=sr_pg_{}'

    for page in range(1,3):
        r = requests.get(url.format(page,page))
        soup = bs(r.content, 'lxml')
        items = [item['alt'] for item in soup.select('.a-link-normal [alt]')]
        print(items)
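As a possible variation, the page URLs can be built from a parameter dictionary instead of a format string, which makes it easier to swap in another seller. This is only a sketch: the helper name `build_page_urls` is hypothetical, and it assumes the `qid` and `ref` parameters from the URL above are optional, which I have not verified.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.amazon.com/s"


def build_page_urls(seller_id, marketplace_id, pages):
    """Build one listing URL per page number for the given seller."""
    urls = []
    for page in pages:
        # Assumption: qid and ref are left out on the guess that they are
        # not needed to fetch the listing pages.
        params = {
            "i": "merchant-items",
            "me": seller_id,
            "page": page,
            "marketplaceID": marketplace_id,
        }
        urls.append(BASE_URL + "?" + urlencode(params))
    return urls


for page_url in build_page_urls("A3WE363L17WQR", "ATVPDKIKX0DER", range(1, 3)):
    print(page_url)
```

Each resulting URL can then be passed to `requests.get`, optionally with the `headers` dictionary from the question, and parsed with the same selector as above.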