Scraping specific ‘dd’ tags with BeautifulSoup and Python

Question

I'm learning BeautifulSoup and I came across a problem: scraping dd tags in HTML. Check out the picture below; I want to get the parameters that are in the red zone. The problem is that I do not know how to access them. I have tried this: But the problem is that sometimes different pages have different parameters.

Accepted Answer

In such cases, you might want to select each dd by the label next to it instead of by index, as an index may lead you to the wrong dd when a page lists different parameters. With the following approach, all you need to do is replace the text within :contains('') to get the matching dd, for example Transakcija, Vrsta stana, and so on.

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.nekretnine.rs/stambeni-objekti/stanovi/zemun-krajiska-41m-bela-fasadna-cila-odlican/NkiRX4sq4Cy/"
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "lxml")

    # Pick the dd inside the dl block whose label contains 'Kategorija:'
    Kategorija = soup.select_one(".base-inf .dl-horozontal:has(:contains('Kategorija:')) > dd")
    # select_one returns None when the label is absent, so fall back to ""
    Kategorija = Kategorija.get_text(strip=True) if Kategorija else ""
    print(Kategorija)
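
To collect several fields at once, the same idea can be wrapped in a small helper. The sketch below is not the answer's selector: it swaps in a plain loop that matches each dt by its exact text and reads the dd that follows, which avoids the text-matching pseudo-class altogether (recent Soup Sieve releases deprecate :contains in favor of :-soup-contains). The sample markup, the get_dd_text helper, and the Sprat: label are assumptions made for illustration; the real page's structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the structure implied by the answer's
# selector; the live page may be laid out differently.
SAMPLE_HTML = """
<div class="base-inf">
  <dl class="dl-horozontal"><dt>Kategorija:</dt><dd> Stan </dd></dl>
  <dl class="dl-horozontal"><dt>Transakcija:</dt><dd>Prodaja</dd></dl>
  <dl class="dl-horozontal"><dt>Vrsta stana:</dt><dd>Jednoiposoban stan</dd></dl>
</div>
"""


def get_dd_text(soup, label):
    """Return the text of the dd that follows the dt equal to `label`, or ""."""
    for dt in soup.select(".base-inf dt"):
        if dt.get_text(strip=True) == label:
            dd = dt.find_next_sibling("dd")
            return dd.get_text(strip=True) if dd else ""
    return ""  # label not present on this page


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "Sprat:" is deliberately absent from the sample to show the fallback.
labels = ["Kategorija:", "Transakcija:", "Vrsta stana:", "Sprat:"]
details = {label: get_dd_text(soup, label) for label in labels}
print(details)
```

A label that is missing from a page simply yields an empty string, so pages with different sets of parameters need no special handling.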
