I am trying to gather the product names from the first two pages of a seller's listings on Amazon, based on the seller name. When I request the page, the response contains all the elements I need; however, when I parse it with BeautifulSoup, they are not listed. Here is my code:
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0'}
res = requests.get("https://www.amazon.com/s?me=A3WE363L17WQR&marketplaceID=ATVPDKIKX0DER", headers=headers)
#print(res.text)
soup = BeautifulSoup(res.text, "html.parser")
soup.find_all("a", href=True)
The product links are not listed. If the Amazon API provides this information, I am open to using it (please provide some examples of its usage). Thanks a lot in advance.
Answer
I extracted the product names from the alt attribute. Is this what you intended?
import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://www.amazon.com/s?me=A3WE363L17WQR&marketplaceID=ATVPDKIKX0DER')
soup = bs(r.content, 'lxml')
# Take the alt text of every element that has an alt attribute inside an .a-link-normal element
items = [item['alt'] for item in soup.select('.a-link-normal [alt]')]
print(items)
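To see what the '.a-link-normal [alt]' selector matches without requesting anything from Amazon, here is a minimal sketch that runs it against a made-up HTML fragment. The sample markup, the product names, and the extract_product_names helper are all assumptions for illustration; the real page structure may differ and can change at any time. The sketch uses the built-in html.parser so that lxml is not required, and it skips empty and repeated alt values, which the answer's one-liner does not do.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely imitating a results page; not copied from Amazon.
SAMPLE_HTML = """
<div>
  <a class="a-link-normal" href="/dp/EXAMPLE1"><img src="1.jpg" alt="Example Product One"></a>
  <a class="a-link-normal" href="/dp/EXAMPLE2"><img src="2.jpg" alt="Example Product Two"></a>
  <a class="a-link-normal" href="/dp/EXAMPLE2"><img src="2.jpg" alt="Example Product Two"></a>
  <a class="a-link-normal" href="/dp/EXAMPLE3"><img src="3.jpg" alt=""></a>
  <a class="other-link" href="/help"><img src="h.jpg" alt="Help"></a>
</div>
"""


def extract_product_names(html):
    """Return the unique, non-empty alt texts found inside .a-link-normal elements."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    # Descendant selector: any element with an alt attribute under .a-link-normal
    for tag in soup.select(".a-link-normal [alt]"):
        name = tag["alt"].strip()
        if name and name not in names:
            names.append(name)
    return names


names = extract_product_names(SAMPLE_HTML)
print(names)
```

The same helper could be called with r.content from the request above in place of SAMPLE_HTML.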
Over two pages:
import requests
from bs4 import BeautifulSoup as bs
# Both {} placeholders take the page number (the page and ref parameters)
url = 'https://www.amazon.com/s?i=merchant-items&me=A3WE363L17WQR&page={}&marketplaceID=ATVPDKIKX0DER&qid=1553116056&ref=sr_pg_{}'
for page in range(1, 3):  # pages 1 and 2
    r = requests.get(url.format(page, page))
    soup = bs(r.content, 'lxml')
    items = [item['alt'] for item in soup.select('.a-link-normal [alt]')]
    print(items)
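The hard-coded URL template above could also be assembled from its parts, which makes it easier to change the seller or the page range. The following is only a sketch: build_seller_page_url is a hypothetical helper, and it leaves out the qid parameter on the unverified assumption that the request works without it. Add it back to the dictionary if that assumption turns out to be wrong.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.amazon.com/s"


def build_seller_page_url(seller_id, page, marketplace_id="ATVPDKIKX0DER"):
    """Build the search URL for one results page of a seller's listings.

    Hypothetical helper; the parameter names mirror the URL used in the answer.
    """
    params = {
        "i": "merchant-items",
        "me": seller_id,
        "page": page,
        "marketplaceID": marketplace_id,
        "ref": f"sr_pg_{page}",
        # qid is omitted here (assumption: it is not required)
    }
    return f"{BASE_URL}?{urlencode(params)}"


# URLs for the first two pages of the seller from the question
urls = [build_seller_page_url("A3WE363L17WQR", page) for page in range(1, 3)]
for u in urls:
    print(u)
```

Each URL can then be passed to requests.get in place of url.format(page, page).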