I am trying to run this code to scrape reviews from the Google Play Store, but I keep getting the following error:
DevTools listening on ws://127.0.0.1:53044/devtools/browser/9de3e58b-6384-4809-bf01-31d47a57879f
Traceback (most recent call last):
  File "c:/Users/Emil/Documents/Guatrain_Reviews/guatrain_reviews.py", line 20, in <module>
    Ptitle = driver.find_element_by_class_name('id-app-title').text.replace(' ','')
  File "C:\Users\Emil\Miniconda3\envs\data_analysis\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 564, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\Users\Emil\Miniconda3\envs\data_analysis\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "C:\Users\Emil\Miniconda3\envs\data_analysis\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Emil\Miniconda3\envs\data_analysis\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"id-app-title"}
  (Session info: chrome=71.0.3578.98)
  (Driver info: chromedriver=2.46.628402 (536cd7adbad73a3783fdc2cab92ab2ba7ec361e1),platform=Windows NT 10.0.17134 x86_64)
I suspect it has something to do with the id-app-title in
Ptitle = driver.find_element_by_class_name('id-app-title').text.replace(' ','')
Could someone point out where I would find that ID for the app I am interested in, or help me identify where I am going wrong?
Thanks
EDIT
The final result I want needs to look something like this:
where, for whichever app URL I insert, it will extract the rating and reviews:
Thanks
Answer
That code is from 2016, so I’m assuming the page structure has changed since then, which is why there is no ‘id-app-title’ or anything else from the original code. That’s just my assumption.
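One way to test that assumption is to list the class names that actually occur in the page source and check whether ‘id-app-title’ is among them. The following is only a sketch using the standard library; `ClassCollector`, `classes_in` and the `sample` markup are made-up names and data for illustration, and in the real script the input would be `driver.page_source`.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classes_in(html):
    """Return the set of class names used anywhere in the given HTML."""
    collector = ClassCollector()
    collector.feed(html)
    return collector.classes


# Hypothetical markup standing in for driver.page_source
sample = '<div class="app-header main"><h1 class="title">My App</h1></div>'
found = classes_in(sample)
print('id-app-title' in found)  # False: the old class name is not in this markup
print('title' in found)         # True
```

If the class is missing from the set, the locator itself is stale and no amount of waiting will make `find_element_by_class_name` succeed.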
There’s a lot of work that still needs to be done with this code, like swapping out the time.sleep calls for implicit waits by selenium and, quite frankly, just making it more robust, as I was originally only looking at this particular app’s reviews (see the edit below). It’s really complex HTML, with tons of nested div and span tags and no specific meaning attached to the attributes, classes, etc., so I had trouble pulling out each user review element.
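The idea behind replacing a fixed sleep with a wait is to poll for a condition and carry on as soon as it holds. Selenium ships its own mechanisms for this (`driver.implicitly_wait` and `WebDriverWait`); the sketch below only illustrates the principle in plain Python. `wait_until` and `button_appears_on_third_try` are hypothetical names, and the fake condition stands in for a call such as looking up the “Show More” button.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns something truthy.

    Exceptions raised by condition() count as "not ready yet".
    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
        except Exception:
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll)


# Fake condition for illustration: the "button" only shows up on the third check
attempts = []


def button_appears_on_third_try():
    attempts.append(1)
    return "button" if len(attempts) >= 3 else None


print(wait_until(button_appears_on_third_try, timeout=2.0, poll=0.01))
```

Compared with `time.sleep(2)` on every pass, this returns as soon as the element is available and fails loudly when it never appears.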
But essentially, I was able to open the page in the browser and have it keep scrolling down until it can click “Show More”, repeating that x times.
Once it does that, it iterates over the span tags. I figured out that every 10 span tags relate to a single user. However, if the app owner responds to a review, that offsets them by 2, so I had to account for that.
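The span bookkeeping can be tried out on plain lists before running it against the live page. This is a sketch under stated assumptions, not the script's actual parser: spans are represented by their text, the rating label is assumed to look like "Rated 4 stars out of five stars", and `walk_spans`, `parse_rating` and `block` are made-up helper names. It also uses a moving cursor rather than multiplying a block index by 10, which is a variation on the approach described above.

```python
SPANS_PER_REVIEW = 10  # every 10 spans relate to one user
REPLY_SPANS = 2        # an owner reply shifts the following spans by 2


def parse_rating(label):
    """Return the star count from a rating label, or None if it is not one.

    Assumption: rating labels start with "Rated " and contain the digit.
    """
    if not label.startswith("Rated "):
        return None
    digits = ''.join(filter(str.isdigit, label))
    return int(digits) if digits else None


def walk_spans(spans):
    """Group a flat list of span texts into review records."""
    reviews = []
    pos = 0
    while pos + SPANS_PER_REVIEW <= len(spans):
        rating = parse_rating(spans[pos + 1])
        if rating is None:
            # Not at the start of a review: assume an owner reply and skip it
            pos += REPLY_SPANS
            continue
        reviews.append({
            "name": spans[pos],
            "rating": rating,
            "date": spans[pos + 2],
            "review": spans[pos + 8],
        })
        pos += SPANS_PER_REVIEW
    return reviews


def block(name, stars, date, text):
    """Build the 10 fake spans of one review (filler spans are empty)."""
    label = "Rated %d stars out of five stars" % stars
    return [name, label, date, "", "", "", "", "", text, ""]


# Two fake reviews with an owner reply (2 extra spans) in between
spans = (block("Ann", 5, "1 February 2019", "Great")
         + ["App owner", "Thanks for the feedback"]
         + block("Bob", 1, "2 February 2019", "Broken"))

for record in walk_spans(spans):
    print(record)
```

Checking the label before trusting the position keeps a reply that happens to contain digits from being read as a rating.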
I’m fairly new to programming, so I apologize for the messy code and inefficiency. I’m sure an expert would be able to provide a better solution; however, this can hopefully get you started or give you something to play around with:
# Load the webdriver from selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import bs4
import pandas as pd
import time

# Change this number to get more or fewer reviews
# Current set of x=100 yielded 11,312 reviews
x = 100

link = "https://play.google.com/store/apps/details?id=uk.co.o2.android.myo2&hl=en_GB"
driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
driver.get(link + '&showAllReviews=true')

# Click "Show More" whenever it can be found; otherwise scroll to the
# bottom of the page and give it time to load more reviews
num_clicks = 0
num_scrolls = 0
while num_clicks <= x and num_scrolls <= x*5:
    try:
        show_more = driver.find_element_by_xpath('//*[@id="fcxH9b"]/div[4]/c-wiz/div/div[2]/div/div[1]/div/div/div[1]/div[2]/div[2]/div/content/span')
        show_more.click()
        num_clicks += 1
    except Exception:
        html = driver.find_element_by_tag_name('html')
        html.send_keys(Keys.END)
        num_scrolls += 1
        time.sleep(2)

soup = bs4.BeautifulSoup(driver.page_source, 'html.parser')
h2 = soup.find_all('h2')

# Collect rows in a list and build the DataFrame once at the end
rows = []
for ele in h2:
    if ele.text == 'Reviews':
        c_wiz = ele.parent.parent.find_all('c-wiz')
        for sibling in c_wiz[0].next_siblings:
            try:
                # comment_shift grows by 2 whenever an owner reply offsets the spans
                comment_shift = 0
                spans = sibling.find_all('span')
                for user_block in range(0, len(spans)):
                    i = user_block * 10
                    name = spans[i+0+comment_shift].text
                    try:
                        rating = spans[i+1+comment_shift].div.next_element['aria-label']
                        rating = str(''.join(filter(str.isdigit, rating)))
                    except Exception:
                        comment_shift += 2
                        continue
                    date = spans[i+2+comment_shift].text
                    review = spans[i+8+comment_shift].text
                    print('Name: %s\nRating: %s\nDate: %s\nReview: %s\n' % (name, rating, date, review))
                    rows.append([date, rating, name, review])
            except Exception:
                # Running past the last span (or hitting a non-tag sibling) ends this sibling
                continue

results_df = pd.DataFrame(rows, columns=['Date', 'Rating', 'User', 'Review'])
results_df.to_csv('C:/reviews.csv', index=False)
driver.close()
Output:
print (results_df)
Date  ...  Review
0  31 January 2019  ...  Was broken for pay as you go customers. Has no...
1  2 February 2019  ...  o2 just won't be happy until their customer se...
2  1 February 2019  ...  Excellent quality piece of kit
3  6 February 2019  ...  Gud π
4  23 December 2018  ...  Can't get into the app using correct log in de...
5  16 December 2018  ...  The update is rubbish. I can't use MyO2 anymor...
6  6 December 2018  ...  Stop logging me out with every update, they ad...
7  25 December 2018  ...  cant use this app anymore. shame i use to use ...
8  16 December 2018  ...  Started receiving texts from 02 immediately af...
9  10 January 2019  ...  havent been with the network long nor have i u...
10  22 December 2018  ...  update has killed this app. why do I have to p...
11  9 January 2019  ...  This app is now unusable for pay as you go cus...
12  26 January 2019  ...  Wouldn't it be nice to find an app that the de...
13  19 December 2018  ...  wont let me log in now since the latest update...
14  13 January 2019  ...  it was ok for a while wen u needed to put in y...
15  6 January 2019  ...  from last update I can't login anymore. not ev...
16  24 January 2019  ...  I'm having 2 change review again coz I can't g...
17  5 January 2019  ...  Changed my rating for this down from five to o...
18  22 December 2018  ...  no longer works for me. shame as it was useful...
19  31 January 2019  ...  total waste of time since update. not able to ...
20  23 January 2019  ...  Despite what the description states the curren...
21  24 December 2018  ...  When it finally lets you log in it then says t...
22  17 January 2019  ...  Update breaks it, can't log in, log in on webs...
23  5 January 2019  ...  02 what have you done to app cant log in chang...
24  30 November 2018  ...  Simple easy to use and all info available of m...
25  30 November 2018  ...  No longer works for pay and go customers so co...
26  8 December 2018  ...  Will not log me in after downloading the lates...
27  15 January 2019  ...  Unable to log on to the app since the update. ...
28  1 January 2019  ...  Very easy to use. Keeps me up to date.
29  1 December 2018  ...  Good app maybe it should be as colourful as th...
...  ...  ...
11282  12 February 2017  ...  Just re installed this a on my new device. Ha...
11283  18 December 2016  ...  Since updating this app on my Samsung S3 mini ...
11284  19 January 2017  ...  Lately the app gives intermittent server error...
11285  7 December 2016  ...  New update
11286  12 December 2016  ...  O2 needs to put right fast
11287  12 February 2017  ...  Although unlimited minutes/texts I would still...
11288  30 December 2016  ...  Never works
11289  13 August 2017  ...  I have a Samsung galaxy 7 and the o2 app is no...
11290  6 December 2016  ...  Doesn't work anymore
11291  4 December 2016  ...  Since the last update this app does not work f...
11292  3 December 2016  ...  O2
11293  5 December 2016  ...  Good app (when it opens)
11294  11 January 2017  ...  Stopped working and when it does work...
11295  1 December 2016  ...  Nothing but a blue screen. Not happy.
11296  2 December 2016  ...  Worst app ever
11297  18 January 2017  ...  It's easier than trying to keep track of my ac...
11298  16 February 2017  ...  The new update only shows blue screen before t...
11299  15 January 2017  ...  Mr Dimitrov
11300  8 February 2017  ...  Code 4 error frequently
11301  4 January 2017  ...  Won't work at all
11302  27 January 2017  ...  O2 GURU , EXCELLENT, ESQISET , PHANOMAL, SE...
11303  15 February 2017  ...  Works well enough.
11304  1 December 2016  ...  Great app keeps you up to.date
11305  28 December 2016  ...  My 02
11306  16 December 2016  ...  This is a "APPY APP""
11307  22 November 2016  ...  Doesn't work for business account. Only shows ...
11308  25 November 2016  ...  Doesn't work anymore
11309  11 November 2016  ...  The ap won't open its just a blue screen I've ...
11310  24 November 2016  ...  Doesn't work
11311  12 November 2016  ...  My 02
[11312 rows x 4 columns]
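To point the script at another app, only the id in the link has to change. A small helper can assemble the URL, including the `showAllReviews=true` parameter the script appends. This is a sketch; `reviews_url` and `BASE` are hypothetical names, and the query parameters are simply the ones that appear in the link used above.

```python
from urllib.parse import urlencode

BASE = "https://play.google.com/store/apps/details"


def reviews_url(app_id, hl=None):
    """Build the "all reviews" URL for an app id, optionally with a language code."""
    params = {"id": app_id}
    if hl:
        params["hl"] = hl
    params["showAllReviews"] = "true"
    return BASE + "?" + urlencode(params)


print(reviews_url("uk.co.o2.android.myo2", hl="en_GB"))
# https://play.google.com/store/apps/details?id=uk.co.o2.android.myo2&hl=en_GB&showAllReviews=true
```

The result can be passed straight to `driver.get`.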
Edit:
I tried with a couple of different links:
link = "https://play.google.com/store/apps/details?id=com.outfit7.mytalkingtom2"
link = "https://play.google.com/store/apps/details?id=com.ingeniooz.hercule"
and it appeared to work:
Output:
print (results_df)
Date  ...  Review
0  February 5, 2019  ...  after update it is not workin before it was ev...
1  February 4, 2019  ...  no word to describe simply π
2  February 6, 2019  ...  I loved this game
3  February 6, 2019  ...  it is very funny game and very nice game also
4  February 6, 2019  ...  π
5  February 6, 2019  ...  relaxing effect
6  February 6, 2019  ...  this is a cool game
7  February 6, 2019  ...  Good game
8  February 6, 2019  ...  Beast
9  February 1, 2019  ...  Love this game, it is so much better then the ...
10  February 1, 2019  ...  The recent updates are epic. The blender and d...
11  February 1, 2019  ...  i like this funny game because tom is jumping ...
12  February 2, 2019  ...  tom 2 is a great game
13  February 3, 2019  ...  Very very nice game
14  February 3, 2019  ...  I like it very much
15  February 5, 2019  ...  Nice and superb game.
16  February 2, 2019  ...  Tom is a cutipie
17  February 2, 2019  ...  it is so...... cute
18  February 2, 2019  ...  tr ty0
19  February 2, 2019  ...  so good
20  February 2, 2019  ...  nice game
21  February 1, 2019  ...  Nice game
22  February 3, 2019  ...  i love this game
23  February 6, 2019  ...  l love this game as it is fun and enjoyable to...
24  February 2, 2019  ...  love it
25  February 5, 2019  ...  it is so awesome πππ
26  February 2, 2019  ...  Amazing
27  February 3, 2019  ...  nice
28  February 6, 2019  ...  good
29  January 30, 2019  ...  Anish Biswa 3 to be a bit. I'm not a good idea...
...  ...  ...
1770  February 2, 2019  ...  fun
1771  February 5, 2019  ...  ect,
1772  February 6, 2019  ...  tom. is so cute
1773  February 6, 2019  ...  nice
1774  January 5, 2019  ...  urguuhtr
1775  January 14, 2019  ...  Very interesting game πππ
1776  January 10, 2019  ...  It s very very very nice
1777  January 21, 2019  ...  supab gameππππ
1778  January 16, 2019  ...  it's too funny πΉπΉπΉπ°π°π°
1779  January 20, 2019  ...  wow Best game
1780  January 27, 2019  ...  It's damn good
1781  January 28, 2019  ...  this a good and supper game. very nice game. ,...
1782  February 4, 2019  ...  i love this game very very very much
1783  January 5, 2019  ...  super
1784  January 12, 2019  ...  It's fun Lol
1785  January 16, 2019  ...  it ,s so good
1786  January 23, 2019  ...  fun game for kids....loved it
1787  January 27, 2019  ...  It's so nice
1788  February 1, 2019  ...  Nice The Baby games i like ππππ
1789  January 29, 2019  ...  it's funny and it's fun to play
1790  January 10, 2019  ...  best game... so cute
1791  January 10, 2019  ...  So Cute!
1792  January 24, 2019  ...  i lv this game very nice game .....
1793  January 25, 2019  ...  Its superb... I love this game... π
1794  January 27, 2019  ...  It is best game ever playedππππππ
1795  January 19, 2019  ...  I love it!
1796  January 20, 2019  ...  good game!
1797  January 16, 2019  ...  i love this game π.
1798  January 25, 2019  ...  It is a good game for kids.....
1799  January 31, 2019  ...  my talking tom is funπππ
[1800 rows x 4 columns]
And
print (results_df)
Date  ...  Review
0  December 2, 2018  ...  It's a very well-thought-out an all rounded ap...
1  January 1, 2019  ...  L'application est superbe et hyper complète! B...
2  December 6, 2017  ...  Great workout diary with statistics. Easy to u...
3  June 13, 2017  ...  I love this app! I've tried so many others, bu...
4  March 28, 2017  ...  Works great at what it does. You can add exerc...
5  March 21, 2017  ...  Great
6  December 8, 2016  ...  Has all I need to build & adjust my workouts
7  October 23, 2016  ...  Goodish
8  September 23, 2016  ...  Great app
9  July 18, 2016  ...  Excellent
10  March 9, 2016  ...  great app.
11  July 10, 2015  ...  Amazing and easy to use
12  June 5, 2015  ...  I dreamt of this app, Hercule made it. Best ap...
13  March 18, 2015  ...  Really good, but...
[14 rows x 4 columns]
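Note that the Date column comes back in two layouts across these runs: "31 January 2019" for the link with hl=en_GB and "February 5, 2019" for the links without it (presumably a locale difference). If the dates need sorting or filtering, they can be normalised first. The sketch below assumes English month names and Python's default locale; `parse_review_date` and `DATE_FORMATS` are hypothetical names.

```python
from datetime import datetime

# The two layouts seen in the outputs above
DATE_FORMATS = ("%d %B %Y", "%B %d, %Y")


def parse_review_date(text):
    """Try each known layout in turn and return a datetime.date."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError("unrecognised date: %r" % text)


print(parse_review_date("31 January 2019"))   # 2019-01-31
print(parse_review_date("February 5, 2019"))  # 2019-02-05
```

With pandas this could be applied to the whole column, for example `results_df['Date'].apply(parse_review_date)`.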