I am scraping www.gsmarena.com and want to extract specific data based on user input. The code below returns every phone brand and model name. I want to extract details for Samsung phones only, filtered by user-supplied specs such as RAM, ROM, CPU and color. Please help. Thanks in advance.
import requests
from bs4 import BeautifulSoup


def link_scan(link_url):
    # List every brand linked from the front page, then scan each brand page.
    c = 1
    source_code = requests.get(link_url)
    plain_text = source_code.text
    # Name the parser explicitly so the result does not depend on what is installed.
    soup = BeautifulSoup(plain_text, 'html.parser')
    for link in soup.find_all('div', {'class': 'brandmenu-v2 light l-box clearfix'}):
        for li in link.find_all('li'):
            for anc in li.find_all('a'):
                anc_src = r'http://www.gsmarena.com/' + anc.get('href')
                anc_name = anc.string
                print(c, anc_name, "\n", anc_src, "\n")
                c += 1
                inside_scan(anc_name, anc_src)


def inside_scan(name, hrefs):
    # Print every model name listed on one brand page.
    i = 1
    source_code = requests.get(hrefs)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, 'html.parser')
    for link in soup.find_all('div', {'class': 'makers'}):
        for li in link.find_all('li'):
            for anc in li.find_all('a'):
                for nam in (sp.find('span') for sp in anc.find_all('strong')):
                    modal_name = nam.string
                    print("\t", i, "\t", name, modal_name)
                    i += 1


link_scan(r'http://www.gsmarena.com/')
Answer
I would advise you to experiment with the URLs for a while. In your case the user may ask for a specific phone manufacturer, and the target URL would look like this:
https://www.gsmarena.com/samsung-phones-9.php
Moreover, you are quite lucky, because you can fetch a phone's details without following the link to its page. Each phone on the listing is an anchor tag with an href like this:
<a href="samsung_galaxy_m31s-10333.php">
This means you can keep only the links whose href starts with “samsung” to filter the results according to the user's needs:
https://www.gsmarena.com/samsung
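The prefix filter described above could be sketched roughly as follows. This is only an illustration: `brand_links` and `BASE_URL` are names made up here, the hrefs are assumed to have been collected already with `anc.get('href')` as in the question's loop, and the `brand_` prefix is assumed from the single sample href shown above.

```python
from urllib.parse import urljoin

# Assumed site root; hrefs on the listing page are relative to it.
BASE_URL = "https://www.gsmarena.com/"


def brand_links(hrefs, brand):
    """Return absolute URLs for the hrefs that start with '<brand>_'."""
    prefix = brand.strip().lower() + "_"
    return [urljoin(BASE_URL, h) for h in hrefs if h.lower().startswith(prefix)]


if __name__ == "__main__":
    # The first href comes from the sample above; the second is a made-up placeholder.
    hrefs = ["samsung_galaxy_m31s-10333.php", "otherbrand_model-1.php"]
    print(brand_links(hrefs, "Samsung"))
```

Lower-casing both sides means the user can type "Samsung" or "samsung" and get the same result.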
To fetch the CPU, RAM, etc., read the title attribute of the img inside each anchor tag:
<a href="samsung_galaxy_m31s-10333.php"><img src="https://fdn2.gsmarena.com/vv/bigpic/samsung-galaxy-m31s.jpg" title="Samsung Galaxy M31s Android smartphone. Announced Jul 2020. Features 6.5″ Super AMOLED display, Exynos 9611 chipset, 6000 mAh battery, 128 GB storage, 8 GB RAM, Corning Gorilla Glass 3."><strong><span>Galaxy M31s</span></strong></a>
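One possible way to turn that title text into fields and compare them with the user's input is sketched below. It assumes every title follows the same "Features a, b, c." pattern as the sample, which is not guaranteed. The helper names `parse_title_specs` and `matches` are hypothetical. In the question's loop the title string could be read with something like `anc.find('img').get('title')`. The sample title carries no color, so that field would have to come from somewhere else.

```python
SAMPLE_TITLE = (
    "Samsung Galaxy M31s Android smartphone. Announced Jul 2020. "
    "Features 6.5″ Super AMOLED display, Exynos 9611 chipset, "
    "6000 mAh battery, 128 GB storage, 8 GB RAM, Corning Gorilla Glass 3."
)

# Trailing words that identify a spec item in the comma-separated list.
SPEC_KEYS = ("display", "chipset", "battery", "storage", "RAM")


def parse_title_specs(title):
    """Split the text after 'Features ' on commas and label known items."""
    specs = {}
    _, _, features = title.partition("Features ")
    for item in features.rstrip(". ").split(", "):
        for key in SPEC_KEYS:
            if item.endswith(" " + key):
                # Drop the trailing keyword, keep the value in front of it.
                specs[key.lower()] = item[: -len(key)].strip()
    return specs


def matches(specs, **wanted):
    """True if every requested field equals the parsed value (case-insensitive)."""
    return all(specs.get(k, "").lower() == v.lower() for k, v in wanted.items())


if __name__ == "__main__":
    specs = parse_title_specs(SAMPLE_TITLE)
    print(specs)
    print(matches(specs, ram="8 GB", storage="128 GB"))
```

A title without the word "Features" simply yields an empty dictionary, so such phones never match a filter instead of raising an error.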