Hey,
I have a problem with character encoding in lxml. Here's how it goes:
I read an HTML document from a third-party site. It is supposed to be
in UTF-8, but unfortunately from time to time it's not. I parse the
document like this:
html_doc = HTML(string_with_document)
Then I retrieve some info from the document with XPath:
xpath_nodes = html_doc.xpath('/html/body/something')...
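
To make the two steps above concrete, here is a minimal, self-contained sketch. It assumes that HTML is lxml.etree.HTML; the sample bytes, the parse_html helper and the choice of Latin-1 are made up for illustration and are not taken from the real site. It only shows what the parse and the XPath lookup look like when the parser is told explicitly which encoding to assume.

```python
from lxml import etree

# Hypothetical stand-in for the third-party page: the bytes are Latin-1,
# not UTF-8, and the document does not declare any encoding itself.
raw_bytes = "<html><body><something>caf\u00e9</something></body></html>".encode("latin-1")


def parse_html(raw, encoding="utf-8"):
    """Parse raw bytes as HTML, telling the parser which encoding to assume.

    parse_html is an illustrative helper, not part of lxml.
    """
    parser = etree.HTMLParser(encoding=encoding)
    return etree.HTML(raw, parser=parser)


# Parse with the encoding the bytes are actually in (assumed known here).
html_doc = parse_html(raw_bytes, encoding="latin-1")

# Retrieve the nodes with XPath, as in the snippet above.
xpath_nodes = html_doc.xpath('/html/body/something')
texts = [node.text for node in xpath_nodes]
```

In this sketch the encoding is passed in by hand; how to find out the right encoding when the site gets it wrong is a separate question.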