Python and HTMLParser.handle_data() – How to get data from tags?

Question

I'm trying to parse a web page with the Python HTMLParser. I want to get the content of a tag, but I'm not sure how to do it. This is the code I have so far:

If I understand correctly, I can use the handle_data() method to get the data between tags. How do I specify which tags to get?

Accepted Answer

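    import urllib2  # Python 2 standard library

    html_code = urllib2.urlopen("xxx")
    html_code_list = html_code.readlines()

    # Keep only the lines that begin with an <h2> tag
    data = ""
    for line in html_code_list:
        line = line.strip()
        if line.startswith("<h2"):
            data = data + line

    # MyHTMLParser is your own HTMLParser subclass
    hp = MyHTMLParser()
    hp.feed(data)
    hp.close()

This way you can extract the data from the h2 tags. Note that the filter only keeps lines that begin with <h2, so a heading that starts mid-line is missed and one that spans several lines is cut short. Hope it helps.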
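For the original question of how to tell handle_data() which tags to listen to, a common pattern is to keep state inside the parser itself: handle_starttag() and handle_endtag() record whether the parser is currently inside the tag of interest, and handle_data() only stores text while that state is set. The following is a minimal sketch of that pattern, assuming Python 3 (html.parser instead of the Python 2 HTMLParser module). The class name TagTextParser, its attributes, and the sample HTML string are made up for illustration and do not come from the question.

```python
from html.parser import HTMLParser


class TagTextParser(HTMLParser):
    """Collect the text inside every occurrence of one tag (illustrative helper)."""

    def __init__(self, target_tag):
        super().__init__()
        self.target_tag = target_tag
        self.depth = 0      # greater than 0 while inside the target tag
        self.chunks = []    # text pieces of the element currently being read
        self.results = []   # finished text, one entry per element

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser.
        if tag == self.target_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.target_tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Outermost target tag closed: store its combined text.
                self.results.append("".join(self.chunks).strip())
                self.chunks = []

    def handle_data(self, data):
        # Called for all text; keep it only while inside the target tag.
        if self.depth > 0:
            self.chunks.append(data)


# Made-up sample input, just to show the parser in use.
parser = TagTextParser("h2")
parser.feed("<h1>Title</h1><h2>First <em>section</em></h2><p>text</p><h2>Second</h2>")
parser.close()
print(parser.results)  # ['First section', 'Second']
```

Because the filtering happens inside the parser, the whole page can be passed to feed() without pre-selecting lines, and text in nested tags such as the em element above is still collected.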


Answer