from bs4 import BeautifulSoup
import requests

def kijiji():
    source = requests.get('https://www.kijiji.ca/b-mens-shoes/markham-york-region/c15117001l1700274').text
    soup = BeautifulSoup(source, 'lxml')
    b = soup.find('div', class_='price')
    for link in soup.find_all('a', class_='title'):
        a = link.get('href')
        fulllink = 'http://kijiji.ca' + a
        print(fulllink)
        b = soup.find('div', class_='price')
        print(b.prettify())

kijiji()
The goal is to list all the different items sold on Kijiji and pair each one with its price. But I can't find any way to advance what Beautiful Soup finds for the class price, so I'm stuck with the first price on every iteration. find_all doesn't work either, as it just prints out the whole blob of prices instead of grouping each price with its item.
Answer
If you have Beautiful Soup 4.7.1 or above, you can use the CSS selector method select(), which is much faster.
code:
import requests
from bs4 import BeautifulSoup

res = requests.get("https://www.kijiji.ca/b-mens-shoes/markham-york-region/c15117001l1700274").text
soup = BeautifulSoup(res, 'html.parser')
for item in soup.select('.info-container'):
    fulllink = 'http://kijiji.ca' + item.find_next('a', class_='title')['href']
    print(fulllink)
    price = item.select_one('.price').text.strip()
    print(price)
Or, to use find_all(), use the code block below:
import requests
from bs4 import BeautifulSoup

res = requests.get("https://www.kijiji.ca/b-mens-shoes/markham-york-region/c15117001l1700274").text
soup = BeautifulSoup(res, 'html.parser')
for item in soup.find_all('div', class_='info-container'):
    fulllink = 'http://kijiji.ca' + item.find_next('a', class_='title')['href']
    print(fulllink)
    price = item.find_next(class_='price').text.strip()
    print(price)
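Both versions work for the same reason: the price lookup starts from each item's container rather than from the whole soup, so it can no longer keep returning the first price on the page. Here is a minimal offline sketch of that idea which collects the pairs into a list instead of printing them. The SAMPLE_HTML markup and the pair_links_with_prices helper are hypothetical, made up only to imitate the class names used above; the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the class names used above;
# the live Kijiji page may be structured differently.
SAMPLE_HTML = """
<div class="info-container">
  <div class="price"> $40.00 </div>
  <a class="title" href="/v-mens-shoes/item-1">Running shoes</a>
</div>
<div class="info-container">
  <div class="price"> $25.00 </div>
  <a class="title" href="/v-mens-shoes/item-2">Dress shoes</a>
</div>
"""

def pair_links_with_prices(html, base_url="http://kijiji.ca"):
    """Return a list of (full link, price) tuples, one per listing container."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for item in soup.select(".info-container"):
        # Search inside this container only, not the whole document.
        link = item.select_one("a.title")
        price = item.select_one(".price")
        if link is None or price is None:
            continue  # skip containers missing either piece
        pairs.append((base_url + link["href"], price.get_text(strip=True)))
    return pairs

if __name__ == "__main__":
    for url, price in pair_links_with_prices(SAMPLE_HTML):
        print(url, price)
```

To run it against the live page, pass the text returned by requests.get(...) in place of SAMPLE_HTML, assuming the page still uses these class names.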