Tag: html-content-extraction

Extract part of a regex match

I want a regular expression to extract the title from an HTML page. Currently I have this: Is there a regular expression that extracts just the contents of <title>, so I don't have to remove the tags?

Answer: Use ( ) in the regexp to define a capture group, and group(1) in Python to retrieve the captured string ( will return None if it doesn't find
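As a minimal sketch of the capture-group approach described in the answer: the function name `extract_title`, the exact pattern, and the choice of `re.IGNORECASE | re.DOTALL` are assumptions made for illustration, not the original poster's code. It relies on `re.search` returning `None` when nothing matches, so the result is checked before `group(1)` is called.

```python
import re

def extract_title(html):
    """Return the text inside <title>...</title>, or None if there is no title."""
    # The parentheses form capture group 1, which holds only the text
    # between the tags. The non-greedy .*? stops at the first </title>.
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    # re.search returns None when the pattern is not found, so guard
    # against that before asking for the group.
    if match is None:
        return None
    return match.group(1).strip()

page = "<html><head><title>Example Page</title></head><body></body></html>"
print(extract_title(page))
```

A regular expression like this is fine for a quick extraction from well-formed pages; for arbitrary or malformed HTML, a real parser such as the standard library's `html.parser` is likely to be more robust.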
