Beautiful Soup 4: Remove comment tag and its content

Question

The page that I'm scraping contains an HTML comment. How do I remove the comment tag (<!-- ... -->) along with its content with bs4?

Accepted Answer

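You can use extract() (solution is based on this answer): PageElement.extract() removes a tag or string from the tree. It returns the tag or string that was extracted.

    from bs4 import BeautifulSoup, Comment

    # The original markup was lost from this page; the div and the
    # comment placeholder below are reconstructed from the code that follows.
    data = """
    <div class="foo">
    cat dog sheep goat
    <!-- ... -->
    </div>
    """

    # Name the parser explicitly to avoid the "no parser specified" warning.
    soup = BeautifulSoup(data, 'html.parser')
    div = soup.find('div', class_='foo')

    # Calling a tag is shorthand for find_all(); the filter keeps only Comment nodes.
    for element in div(text=lambda text: isinstance(text, Comment)):
        element.extract()

    print(soup.prettify())

As a result you get your div without comments:

    <div class="foo">
     cat dog sheep goat
    </div>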
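The same idea can be wrapped in a small helper that strips comments from the whole document rather than from a single div. This is only a sketch: the function name strip_comments and the sample markup are made up for illustration, and it assumes a bs4 version recent enough to accept the string= argument (older releases spell it text=).

```python
from bs4 import BeautifulSoup, Comment


def strip_comments(html):
    """Return html with every comment node removed (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Comment is a NavigableString subclass, so a string filter can select it.
    # find_all() returns a list, so removing nodes while looping is safe.
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        comment.extract()  # detach the comment from the tree
    return str(soup)


# Made-up sample markup; the comment text is an assumption for illustration.
sample = '<div class="foo">cat dog sheep goat<!-- hidden note --></div>'
cleaned = strip_comments(sample)
print(cleaned)
```

Since extract() returns the node it removed, the loop could also collect the comments if their text is needed later.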
