Web scraping from the span element

Question

I am working on a scraping project and want to extract only Christian, Islam as the output from a block of HTML, without the 'Faith:' label. How can I do this?

Accepted Answer

There are several ways to do this. I would suggest selecting every <span> inside the <div> that does not have the class "h5":

    soup.select('div.spec-subcat.attributes-religion span:not(.h5)')

Example

    from bs4 import BeautifulSoup

    html_text = '''
    <div class="spec-subcat attributes-religion">
        <span class="h5">Faith:</span>
        <span>Christian</span>
        <span>Islam</span>
    </div>
    '''

    soup = BeautifulSoup(html_text, 'lxml')

    # Join the text of every span that is not the "Faith:" label
    ', '.join([x.get_text() for x in soup.select('div.spec-subcat.attributes-religion span:not(.h5)')])

Output

    Christian, Islam
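As one of the other possible ways, the same filtering can be sketched with only Python's standard-library html.parser, which may help when BeautifulSoup or lxml is not installed. This is a minimal sketch, not the answer's method: the class name SpanTextCollector and the sample markup are assumptions made for illustration, with the markup mirroring the structure that the selector above implies.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collect text from <span> tags inside the target <div>, skipping span.h5.

    Hypothetical helper for illustration; not part of any library.
    """

    TARGET_CLASSES = {"spec-subcat", "attributes-religion"}

    def __init__(self):
        super().__init__()
        self.div_depth = 0          # > 0 while inside the target div
        self.in_value_span = False  # True while inside a span without "h5"
        self.values = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div":
            # Enter the target div, or count nested divs once inside it.
            if self.div_depth or self.TARGET_CLASSES <= set(classes):
                self.div_depth += 1
        elif tag == "span" and self.div_depth:
            # Spans carrying the "h5" class hold the label, not a value.
            self.in_value_span = "h5" not in classes

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth:
            self.div_depth -= 1
        elif tag == "span":
            self.in_value_span = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_value_span and text:
            self.values.append(text)


# Assumed sample markup, matching the structure the selector targets.
html_text = """
<div class="spec-subcat attributes-religion">
    <span class="h5">Faith:</span>
    <span>Christian</span>
    <span>Islam</span>
</div>
"""

parser = SpanTextCollector()
parser.feed(html_text)
result = ", ".join(parser.values)
print(result)
```

The CSS selector version stays shorter and is usually the better choice when BeautifulSoup is available; the parser class only makes the "skip the span with class h5" rule explicit.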
