I'm using regular expressions to parse HTML hyperlinks and I've run into a problem. I'm trying to escape characters
such as '.' and '?' for use in regular expressions, but it's not working.
Why aren't the added slashes being interpreted as escape characters? The code and its output are shown below.
Code:
import re

# Grabs a link. For this example, let's say that the string grabbed is '<a href="http://google.com/?q=foo">Click</a>'
link_url_original = GetLink()
# Sanitize string for regex use
link_url_original = re.sub(r"\.", r"\.", link_url_original)
link_url_original = re.sub(r"\?", r"\?", link_url_original)
toSub = 'http://google.com/?q=foo'
toRepl = 'http://www.yahoo.com'
final = re.sub(toSub, toRepl, link_url_original)
print(final)
Output:
<a href="http://google\.com/\?q=foo">Click</a>
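For comparison, here is a minimal sketch of what I think may be going wrong, assuming the goal is to replace the literal URL. The backslashes were added to the text being searched, not to the pattern, so the unescaped pattern no longer matches the modified text. In this sketch the pattern is escaped instead, using re.escape from the standard library. The names link, to_sub and to_repl are illustrative, and the hard-coded link is a stand-in for whatever GetLink() returns.

```python
import re

# Hypothetical stand-in for the string returned by GetLink() in the post.
link = '<a href="http://google.com/?q=foo">Click</a>'

to_sub = 'http://google.com/?q=foo'
to_repl = 'http://www.yahoo.com'

# Escape the pattern (the thing being searched for), and leave the
# text being searched untouched.
pattern = re.escape(to_sub)

final = re.sub(pattern, to_repl, link)
print(final)  # <a href="http://www.yahoo.com">Click</a>
```

Since nothing here needs regex features, link.replace(to_sub, to_repl) would presumably give the same result without any escaping.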