How to scrape reviews from the Chrome Web Store for a given extension?

I am trying to use this Python code to scrape the Chrome Web Store:

from lxml import html
import requests
url = 'https://chrome.google.com/webstore/detail/cookie-editor/hlkenndednhfkekhgcdicdfddnkalmdm'
values = {'username': 'myemail@gmail.com',
          'password': 'mypassword'}
page = requests.get(url, data=values)
print(page)
tree = html.fromstring(page.content)
review = tree.xpath('//div[@class="ba-Eb-ba"]/text()')[0]
print(review)

However, I am getting a 400 Bad Request response. Is it even possible to scrape the Chrome Web Store?

Answer

The page's content is loaded by JavaScript, so the reviews are not in the HTML that a plain HTTP request returns. You need a browser automation tool such as Selenium to get the rendered data.

Example:

import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
# Keep the browser window open after the script finishes.
options.add_experimental_option("detach", True)
webdriver_service = Service("./chromedriver")  # Your chromedriver path
driver = webdriver.Chrome(service=webdriver_service, options=options)

driver.get('https://chrome.google.com/webstore/detail/cookie-editor/hlkenndednhfkekhgcdicdfddnkalmdm')
driver.maximize_window()
time.sleep(3)  # Give the JavaScript-rendered page time to load

# Open the "Reviews" tab so the review elements are added to the DOM.
driver.find_element(By.XPATH, '//*[@class="e-f-b-L" and contains(text(),"Review")]').click()
time.sleep(1)

# Parse the rendered page source rather than the raw HTTP response.
soup = BeautifulSoup(driver.page_source, "html.parser")

data = []
for review in soup.select('div.ba-bc-Xb'):
    name = review.select_one('span[class="comment-thread-displayname"]').get_text(strip=True)
    comment = review.select_one('div[class="ba-Eb-ba"]').get_text(strip=True)
    data.append({'name': name, 'comment': comment})

print(data)

      

Output:

[{'name': 'PingPing But', 'comment': 'Love it..... so simple and easy to use !'}, {'name': 'Zhou Jeffrey', 'comment': "doesn't work anymore"}, {'name': 'eunice miralles', 'comment': 'same im trying to find a fix and in github they said it has a problem with permission but still not fixed'}, {'name': 'Jade Martinito', 'comment': 'me too'}, {'name': 'Bonafide Champ', 'comment': 'It works fine but it does this weird thing when I import cookies in incognito mode, 
the cookies still get imported in the main browser windows.'}, {'name': 'Arman Nawaz World', 'comment': 'Easy to use this extension. it is very user friendly and simple interface, while other looks little complicatednReview by ArmanxNawaz'}, {'name': 'Bagong Pook Elementary School', 'comment': 'Easy to use! Very helpful'}, {'name': 'Whitelisted', 'comment': 'Works great for development and resetting website cookies without digging through your settings'}, {'name': 'Rehxn Ali', 'comment': 'Best!! Saved Alot of Money With This Extention'}, {'name': 'biniyam demeke', 'comment': 'Oh, Very Helpful'}, {'name': 'Pingu VFX', 'comment': 'Easy to use while scamming kids on their roblox accountes'}, {'name': 'Abstractedjuice09 Z', 'comment': 'how?'}, {'name': 'jd', 'comment': 'lol same'}, {'name': 'Arnells Designs', 'comment': 'good'}, {'name': 'David Galbraith', 'comment': 'How is this called a cookie "editor"?? Not working at all. When I open it, the extension shows  cookies for the page that I'm currently on. It should be able to show cookies from every site I've visited. And if I type ANYTHING in the search, nothing comes up. Not google, not Facebook, not steam, not one site that I have visited or logged into show up in the search bar. There is something very, very wrong. yeah, I can delete ALL cookies, but CCleaner does that just fine.'}, {'name': 'df fes', 'comment': 'Maybe you dont know how to use it?'}, {'name': 'Galih Kamulyan', 'comment': 'LEGENDARY'}, {'name': 'Aniket Chaudhary', 
'comment': 'Liked it. But after using it for sometime, it shows an "unknown error".'}, {'name': 'Anonymous', 'comment': "mine doesn't work for first time too , it always show unknown error"}, {'name': 'Ehsan Abtahee', 'comment': 'did u find a fix?'}, {'name': 'kashba', 'comment': 'if you find a fix.. do tell me'}, {'name': 'Nischay2004 Muller', 'comment': 'The best easy cookie editor for all , strongly recommended'}, {'name': 'ultra noob', 'comment': 'Super simple and easy to use.'}, {'name': 'विकास कालीरामना', 'comment': 'Loved it!'}, {'name': 'Zachary Bolt', 'comment': 'Clean, easy to use and actively updated. 5 Stars well earned.'}, {'name': 'TALHA JUBAYER', 'comment': "Love it .it's 
working"}, {'name': 'amrozain 2007', 'comment': 'good for hackers'}, {'name': 'Kazuko Masao', 'comment': 'Very good 
.. Very good .. Very good.'}, {'name': 'chase Brigette', 'comment': 'This extention seems to be the culprit that makes bing my default browser!!! The extension was good before I realized this -_-"'}, {'name': 'Digital Audio Directions', 'comment': 'This is a joke right?  Only seems to list cookies of the site you are on and all in a chopped up list format.  NO search function for existing stored cookies?  Search by keyword, date, etc,  Does not seem available.'}, {'name': 'Phantom V', 'comment': 'This seems outdated.'}, {'name': 'Anonymous ZN49', 'comment': 'Easy to use this extension. it is very user friendly and simple interface, while other looks little complicated.'}, {'name': 'YongYi Wu', 'comment': "Who don't love cookies?"}, {'name': 'hush', 'comment': 'was working fine, now im getting an import error'}]
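
A quick way to check whether a page needs a browser at all is to look for the target element in the HTML that a plain HTTP client receives. The following is a minimal sketch using only the standard library. Note that `raw_html` and `rendered_html` are hypothetical stand-ins, not real Chrome Web Store responses, for what `requests` and a Selenium-driven browser might return, and `ClassFinder` and `has_class` are illustrative names.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any start tag carries the given CSS class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.found = True


def has_class(html_text, class_name):
    """Return True if any element in html_text has class_name."""
    finder = ClassFinder(class_name)
    finder.feed(html_text)
    return finder.found


# Hypothetical stand-in for the HTML a plain HTTP client receives:
# an empty shell that JavaScript fills in later.
raw_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

# Hypothetical stand-in for driver.page_source after the page has rendered.
rendered_html = (
    '<html><body><div class="ba-bc-Xb">'
    '<div class="ba-Eb-ba">Loved it!</div>'
    '</div></body></html>'
)

print(has_class(raw_html, "ba-Eb-ba"))       # False
print(has_class(rendered_html, "ba-Eb-ba"))  # True
```

If the class is missing from the raw response but present in `driver.page_source`, the element is being added by JavaScript, and a browser-driven approach like the Selenium example above is the appropriate tool.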