Search Result

Nov 3 '08, 11:45 AM

Hey,

I have a problem with character encoding in lxml. Here's how it goes:

I read an HTML document from a third-party site. It is supposed to be
in UTF-8, but unfortunately from time to time it's not. I parse the
document like this:

html_doc = HTML(string_with_document)

Then I retrieve some info from the document with XPath:

xpath_nodes = html_doc.xpath('/html/body/something')...
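
To make the "supposed to be UTF-8, but sometimes isn't" situation concrete, here is a minimal sketch of decoding the raw bytes defensively before parsing. It assumes Python 3 and that the document arrives as bytes. The helper name decode_html and the latin-1 fallback are illustrative assumptions, not part of lxml or of the setup described above.

```python
def decode_html(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode bytes as UTF-8, falling back to another encoding on failure.

    Hypothetical helper for illustration; the fallback encoding is an
    assumption and should match whatever the site really sends.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this cannot raise,
        # but the text may be wrong if the real encoding is something else.
        return raw.decode(fallback)


# The same word encoded two ways: valid UTF-8 and latin-1 bytes.
well_formed = "café".encode("utf-8")
not_utf8 = "café".encode("latin-1")

print(decode_html(well_formed))
print(decode_html(not_utf8))
```

The decoded text is what would then be passed to HTML() in place of the raw string. A fixed fallback is only a guess. If the site sends several different encodings, this approach will silently produce wrong characters for some of them.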

encoding in lxml