I'm trying to get the percentage of reviews for each star rating on an Amazon product page.
This is the output I want to get:
    Awesome Feedback: 72%
    Good Feedback: 15%
    Regular Feedback: 7%
    Bad Feedback: 3%
    Awful Feedback: 4%
And so far this is the output I get:
    Awesome Feedback: 72%
    Traceback (most recent call last):
      File "c:\Users\Nana\Desktop\stuff\Python\Web Scraping\Amazon Smart Buyer\amazonR.py", line 34, in <module>
        bot()
      File "c:\Users\Nana\Desktop\stuff\Python\Web Scraping\Amazon Smart Buyer\amazonR.py", line 14, in __init__
        self.r()
      File "c:\Users\Nana\Desktop\stuff\Python\Web Scraping\Amazon Smart Buyer\amazonR.py", line 26, in r
        print(f'Good Feedback: {self.pd[1]}')
    IndexError: list index out of range
As you can see, I have managed to get the Awesome Feedback line working, but not the other ones. The problem is that the percentages end up in separate lists: every percentage has its own one-element list, as you can see here:
    ['72%'], ['15%'], ['7%'], ['3%'], ['4%']
I'm really struggling with this. If there is a way to collect the values from every iteration of the for loop and merge them into one list, please share it with me. Here is my code:
    from bs4 import BeautifulSoup
    from selenium import webdriver


    class bot:
        def __init__(self):
            self.path = 'C:/Users/Nana/Desktop/stuff/Python/Web Scraping/chromedriver.exe'
            self.browser = webdriver.Chrome(self.path)
            self.browser.get('https://www.amazon.com/%D7%9E%D7%A7%D7%9C%D7%93%D7%AA-%D7%9E%D7%95%D7%90%D7%A8%D7%AA-%D7%91%D7%A6%D7%91%D7%A2%D7%99-%D7%95%D7%A2%D7%9B%D7%91%D7%A8-%D7%9C%D7%92%D7%99%D7%99%D7%9E%D7%99%D7%A0%D7%92/dp/B016Y2BVKA/ref=sr_1_1_sspa?dchild=1&keywords=keyboard&qid=1633809059&sr=8-1-spons&psc=1&smid=A3TJEO884AOUB3&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUFCRTg0S1dWNjRTQUMmZW5jcnlwdGVkSWQ9QTA2NTEwNzgzNFdKSVA5NEpQODRQJmVuY3J5cHRlZEFkSWQ9QTAwMjcwNDExUFJOUjA4U0pEWDlRJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ==')
            self.r()

        def r(self):
            self.soup = BeautifulSoup(self.browser.page_source, 'lxml')
            self.div5 = self.soup.find('div', id = 'reviewsMedley')
            self.tbody = self.div5.find('tbody')
            self.trs = self.tbody.find_all('tr')
            for self.tr in self.trs:
                self.precents = self.tr.find('td', class_ = 'a-text-right a-nowrap')
                self.pd = [self.precents.text.strip()]
                print(f'Awesome Feedback: {self.pd[0]}')
                print(f'Good Feedback: {self.pd[1]}')
                print(f'Regular Feedback: {self.pd[2]}')
                print(f'Bad Feedback: {self.pd[3]}')
                print(f'Awful Feedback: {self.pd[4]}')


    bot()
Answer
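Your loop assigns a new one-element list to self.pd on every iteration, so self.pd[1] never exists and the second print raises the IndexError. There are two options. #1: define pd as an empty list outside the loop, append the result of each iteration to it, and print only after the loop has finished. #2: build the whole list in one step with a CSS selector and a list comprehension, as in the following: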
Example
    def r(self):
        self.soup = BeautifulSoup(self.browser.page_source, 'lxml')
        self.pd = [x.text.strip() for x in self.soup.select('div#reviewsMedley tr td.a-text-right.a-nowrap')]
        print(f'Awesome Feedback: {self.pd[0]}')
        print(f'Good Feedback: {self.pd[1]}')
        print(f'Regular Feedback: {self.pd[2]}')
        print(f'Bad Feedback: {self.pd[3]}')
        print(f'Awful Feedback: {self.pd[4]}')
Output:
    Awesome Feedback: 72%
    Good Feedback: 15%
    Regular Feedback: 7%
    Bad Feedback: 3%
    Awful Feedback: 4%
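Option #1 is only described above, so here is a minimal sketch of what it could look like. To keep it runnable without Selenium, it parses a small hard-coded HTML string instead of browser.page_source. SAMPLE_HTML, LABELS and collect_percentages are hypothetical names made up for this illustration, and the markup is a simplified stand-in that assumes the same ids and classes the question's code looks for, not Amazon's real page.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the ratings table on the product page.
SAMPLE_HTML = """
<div id="reviewsMedley">
  <table><tbody>
    <tr><td class="a-text-right a-nowrap"> 72% </td></tr>
    <tr><td class="a-text-right a-nowrap"> 15% </td></tr>
    <tr><td class="a-text-right a-nowrap"> 7% </td></tr>
    <tr><td class="a-text-right a-nowrap"> 3% </td></tr>
    <tr><td class="a-text-right a-nowrap"> 4% </td></tr>
  </tbody></table>
</div>
"""

# Labels in the order the rows appear (5 stars down to 1 star).
LABELS = ['Awesome', 'Good', 'Regular', 'Bad', 'Awful']


def collect_percentages(html):
    """Return the percentage text of every table row as one flat list."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = soup.find('div', id='reviewsMedley').find('tbody').find_all('tr')

    percentages = []  # created once, before the loop
    for tr in rows:
        cell = tr.find('td', class_='a-text-right a-nowrap')
        if cell is not None:  # skip rows without a percentage cell
            percentages.append(cell.text.strip())  # grow the same list
    return percentages


percentages = collect_percentages(SAMPLE_HTML)

# Print only after the loop has filled the list.
for label, value in zip(LABELS, percentages):
    print(f'{label} Feedback: {value}')
```

Pairing the labels with zip instead of indexing pd[0] to pd[4] means that a page with fewer than five rows prints fewer lines rather than raising an IndexError.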