Using Decompose to remove empty tag

Question

I am trying to search for emails in HTML elements. If no email is found in one element, I want the code to search another element in the HTML, and if no email is found in the end, to set email to "N/A". I am new to writing code and I am trying to

Accepted Answer

Rather than going iteratively class by class, why not go top to bottom across the whole HTML irrespective of the class, and if you find an email, store it along with the class of the element in a dictionary? Then you can look up the email in the dictionary based on which class you want to check first.

```python
import re

# The dot before the top-level domain is escaped so it matches a literal "."
EMAIL_REGEX = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"

def applyRegex(element):
    # Filter for soup.find_all: keep elements whose text contains an email
    return bool(element.text and re.search(EMAIL_REGEX, element.text))

final_dict = {}
# soup is the BeautifulSoup object you already created from the HTML
email_elements = soup.find_all(applyRegex)
for element in email_elements:
    emailsFound = re.findall(EMAIL_REGEX, element.text)
    for email in emailsFound:
        if element.has_attr('class'):
            # element['class'] is a list, so join it into a hashable key
            classname = ' '.join(element['class'])
            final_dict.update({classname: email})

if final_dict:
    # do whatever you want to do with the dictionary
    pass
else:
    print("N/A")
```
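The last step, looking up the email by whichever class you want to check first, could be sketched as below. This is only an illustration: `pick_email`, the `class_priority` list, and the class names `contact` and `footer` are hypothetical and not part of the original code. It assumes a dictionary keyed by class name, like `final_dict`.

```python
def pick_email(emails_by_class, class_priority, default="N/A"):
    """Return the email of the first class in class_priority that has one."""
    for classname in class_priority:
        email = emails_by_class.get(classname)
        if email:
            return email
    # No preferred class had an email, so fall back to the default
    return default

# Hypothetical data shaped like final_dict (class name -> email)
sample = {"footer": "info@example.com", "contact": "sales@example.com"}

# Check the "contact" class first, then "footer"
email = pick_email(sample, ["contact", "footer"])
print(email)
```

Passing an empty dictionary returns the default, which covers the case where no email was found anywhere in the HTML.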

Answer