Add encoding to XML element

  • JohnDoh
    New Member
    • Nov 2006
    • 1

    Add encoding to XML element

    Hello

    I've been working on a script that generates an XML document, and the output needs to have the encoding "ISO-8859-1" declared in the initial <?xml ...?> declaration.

    I found this example at http://infohost.nmt.edu/tcc/help/pubs/pyxml/creating.html:

    Code:
    import xml.dom.minidom
    #import xml.dom.ext as domExt
    
    dom = xml.dom.minidom.getDOMImplementation()
    
    doctype = dom.createDocumentType("html",
                  "-//W3C//DTD XHTML 1.0 Strict//EN",
                  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" )
    
    doc = dom.createDocument( None, "html", doctype )
    (... snip ...)
    This should produce something like
    Code:
    <?xml version='1.0' encoding='UTF-8'?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html/>
    However, when I try it, the encoding part is not generated. Also, AFAIK "UTF-8" is the default encoding, and I need a different one.

    So how do you set what encoding to use?
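
For reference, one approach that may work here is to pass an encoding when serializing, since minidom's `toxml()` accepts an `encoding` argument and writes it into the XML declaration. The sketch below is written for Python 3 and reuses the doctype from the example above; the text node content is a made-up placeholder added only to show a non-ASCII character being encoded.

```python
import xml.dom.minidom

impl = xml.dom.minidom.getDOMImplementation()

doctype = impl.createDocumentType(
    "html",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
)
doc = impl.createDocument(None, "html", doctype)

# Hypothetical content (not from the original post): a non-ASCII
# character, so the effect of the chosen encoding is visible.
doc.documentElement.appendChild(doc.createTextNode("caf\u00e9"))

# With an encoding given, toxml() returns bytes and includes the
# encoding in the <?xml ...?> declaration.
xml_bytes = doc.toxml(encoding="ISO-8859-1")

print(xml_bytes.decode("ISO-8859-1"))
```

If indented output is wanted, `toprettyxml()` takes the same `encoding` argument. Without an encoding, minidom returns a text string and leaves the encoding out of the declaration, which would match the behaviour described above.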
  • bartonc
    Recognized Expert
    • Sep 2006
    • 6478

    #2
    I don't have any experience with this, but section 8 of the 2.5 docs has some interesting-looking tidbits, like:

    8.4 htmlentitydefs -- Definitions of HTML general entities

    This module defines three dictionaries, name2codepoint, codepoint2name, and entitydefs. entitydefs is used by the htmllib module to provide the entitydefs member of the HTMLParser class. The definition provided here contains all the entities defined by XHTML 1.0 that can be handled using simple textual substitution in the Latin-1 character set (ISO-8859-1).

    entitydefs A dictionary mapping XHTML 1.0 entity definitions to their replacement text in ISO Latin-1.
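
As a rough illustration of the dictionaries quoted above, here is a small sketch. It assumes Python 3, where the `htmlentitydefs` module was renamed `html.entities`; the entity `eacute` is just an arbitrary example. These tables map entity names to characters and do not by themselves set the encoding of an XML declaration.

```python
# Python 3 name for the module that 2.x calls htmlentitydefs.
from html.entities import name2codepoint, codepoint2name

# Look up the code point for a named entity, and the reverse mapping.
codepoint = name2codepoint["eacute"]
name = codepoint2name[codepoint]

# The character is representable in Latin-1 (ISO-8859-1) as one byte.
latin1_bytes = chr(codepoint).encode("ISO-8859-1")

print(codepoint, name, latin1_bytes)
```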
