BeautifulSoup Case Insensitive?

I was reading: Is it possible for BeautifulSoup to work in a case-insensitive manner?

But that is not what I actually need. I’m looking for all img tags in a webpage, including variants such as IMG, Img, etc.

This code:

images = soup.findAll('img')

will only look for img tags case-sensitively, so how can I solve this without adding a new line for every possible spelling (and maybe forgetting some)?

Please note that the question linked above isn’t about the tag name but about its attributes.

Answer

BeautifulSoup is not case-sensitive per se; just give it a try. The HTML parsers convert tag names to lowercase, so find_all('img') also matches IMG and iMG. If something is missing from your result, there is probably another issue. If you ever need case-sensitive parsing, you can force it by using the XML parser.
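
The parser difference described above can be checked with a minimal sketch like the one below. The markup and variable names are made up for illustration, and the 'xml' parser assumes lxml is installed, so that part is skipped when it is missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample markup, chosen so that it is valid for both parsers.
markup = '<root><img src="a.png"/><IMG src="b.png"/></root>'

# HTML parsers lowercase tag names, so both spellings match 'img'.
html_soup = BeautifulSoup(markup, 'html.parser')
html_count = len(html_soup.find_all('img'))

# The XML parser keeps names as written, so only the exact spelling matches.
# It requires lxml; without it, bs4 raises FeatureNotFound.
try:
    xml_soup = BeautifulSoup(markup, 'xml')
    xml_count = len(xml_soup.find_all('img'))
except FeatureNotFound:
    xml_count = None  # lxml not installed, comparison skipped

print(html_count, xml_count)
```

With lxml available, the HTML count should be 2 and the XML count 1, which is the case-sensitive behaviour mentioned above.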

Note: In newer code, avoid the old findAll() syntax and use find_all() instead. For more, take a minute to check the docs.

Example
from bs4 import BeautifulSoup
html = '''
<img src="" alt="lower">
<IMG src="" alt="upper">
<iMG src="" alt="mixed">
'''
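# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, 'html.parser')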

soup.find_all('img')
Output
[<img alt="lower" src=""/>,
 <img alt="upper" src=""/>,
 <img alt="mixed" src=""/>]
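
If you ever do end up with a parser that keeps the original case (such as the XML parser), one way to stay independent of it is to match on the lowercased tag name yourself. This is only a sketch: the helper name is_img and the sample markup are my own, not part of BeautifulSoup.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup with three spellings of the same tag.
html = '<div><img src="a.png"><IMG src="b.png"><iMG src="c.png"></div>'
soup = BeautifulSoup(html, 'html.parser')

def is_img(tag):
    # Compare the lowercased name, so the match does not depend on
    # whether the parser normalized the case.
    return tag.name.lower() == 'img'

# find_all() accepts a function that receives each tag and returns True/False.
images = soup.find_all(is_img)
sources = [img['src'] for img in images]

# A compiled regular expression with IGNORECASE is an alternative.
by_regex = soup.find_all(re.compile(r'^img$', re.IGNORECASE))

print(sources)
print(len(by_regex))
```

With an HTML parser this returns the same tags as find_all('img'), so it is only worth the extra code when the parser preserves case.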