Tag: beautifulsoup

Python – Extract string from website with Beautifulsoup

I would like to extract a string from an HTML source using only BeautifulSoup. I am trying to extract “1 van de maximaal 3 actieve reacties” (Dutch for “1 of a maximum of 3 active comments”) from the following HTML: My current code retrieves the entire span tag, but I cannot work out how to extract only the string, without the use of .split or some sort of string
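
The question's actual HTML is not included in this excerpt, so the following is only a minimal sketch under assumptions: the markup and the class name `reactions-count` are hypothetical. It shows one way to get the text of a span rather than the whole tag, using `get_text()`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page's HTML is not shown in the excerpt,
# and the class name "reactions-count" is invented for illustration.
html = """
<div class="comments">
  <span class="reactions-count"> 1 van de maximaal 3 actieve reacties </span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the whole Tag object (what the question's code printed).
span = soup.find("span", class_="reactions-count")

# get_text(strip=True) returns only the text inside the tag,
# with surrounding whitespace removed; no .split() is needed.
text = span.get_text(strip=True)

print(text)
```

If the span might be missing on some pages, `find()` returns `None`, so a check before calling `get_text()` would be a sensible addition.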

How to select all <b> tags in HTML

From this webpage I need to select all <b> </b> tags with BeautifulSoup4. I have tried using find_all() and select(), but they fail to return all <b> tags when used in the array Answer: Different parsers can be used to parse an HTML document; the most commonly used one is ‘html.parser’. I have used lxml here, which uses both xml and
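
As a rough illustration of the two calls named in the question, the sketch below collects every `<b>` tag from a small, hypothetical HTML fragment (the original webpage is not shown in the excerpt). It uses the built-in `html.parser` so that it runs without extra installs; the answer in the excerpt used lxml instead, which can be swapped in by passing `"lxml"` if that package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page from the question.
html = """
<p><b>first</b> plain <b>second</b></p>
<div><b>third</b></div>
"""

# "html.parser" ships with Python; "lxml" is a separate third-party parser.
soup = BeautifulSoup(html, "html.parser")

# find_all("b") walks the whole tree and returns every <b> Tag.
bold_tags = soup.find_all("b")

# select("b") does the same lookup using a CSS selector.
bold_via_css = soup.select("b")

# Pull out just the text of each tag.
bold_texts = [b.get_text() for b in bold_tags]

print(bold_texts)
```

Parsers can build different trees from malformed HTML, so if some tags seem to be missing, trying another parser is a reasonable first check.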

How to only scrape link from webpage – Python

My goal is to get each link. My code prints the href/link, but it also prints other junk that I do not want. I only want the href/ Answer: href=True only selects the tags that have an href attribute; the results are still Tag objects. To get the href itself, you also need to call .get("href"). Since there is only one button in each session tag,
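
The point made in the answer can be sketched as follows. The markup is hypothetical (the question's page is not shown in the excerpt); only `find_all(..., href=True)` and `.get("href")` come from the answer itself.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one link button per <session> tag, plus one
# anchor without an href to show what href=True filters out.
html = """
<session><a class="button" href="/talks/1">Talk 1</a></session>
<session><a class="button" href="/talks/2">Talk 2</a></session>
<session><a class="button">No link</a></session>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True keeps only <a> tags that have an href attribute,
# but each result is still a Tag object, not a string.
anchors = soup.find_all("a", href=True)

# .get("href") reads the attribute value from each Tag.
links = [a.get("href") for a in anchors]

print(links)
```

Printing `anchors` directly would show the full tags, which is likely the "junk" described in the question; printing `links` shows only the attribute values.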

BeautifulSoup returns empty list with valid html content

I’m trying to build a web scraper for a Hungarian e-commerce site called https://www.arukereso.hu. The problem is that when the nextpage() function is first called, it returns a valid link (https://www.arukereso.hu/notebook-c3100/?start=25) and the request’s content is also valid HTML, but BeautifulSoup produces an empty list from it, so the program ends with an error. I would be grateful if someone could
