pyXML beginner questions

This topic is closed.
  • Sebastian Fey

    pyXML beginner questions



    Hi,

    I'd like to do the following:

    (1) Open an .xml file, change something, and save it.
    The problem is: how do I save/serialize it?
    I tried xml.dom.ext.Print, but it resolves all entities and serializes
    the XML with the entities expanded (see the example below).



    (2) I'd also like to load external parsed entities referenced in the XML.
    MSXML provides an extension(?) to DOM which returns the URI for an
    EntityReference node.
    Is there anything similar in PyXML? Actually, is the EntityReference node
    type implemented in PyXML at all? I always get the nodeType of the resolved
    entity, i.e. 3 (TEXT_NODE), for an internal entity.


    #############
    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE xbel [
    <!ENTITY intTxt 'GIGI'>
    <!ENTITY intMarkup '<entIntern>text</entIntern>'>
    <!ENTITY extParsed SYSTEM "ent.xml">
    ]>
    <root>
    <text>&intTxt;</text>
    &intMarkup;
    &extParsed;
    </root>
    ############

    becomes:

    #############
    <?xml version="1.0" encoding="iso-8859-1"?>
    <root>
    <text>some text</text>
    <entIntern>text</entIntern>'
    <entIntern>text</entIntern>'
    </root>
    #############



    thx,
    Sebastian




  • Uche Ogbuji

    #2
    Re: pyXML beginner questions

    Sebastian Fey <fey@parsytec.de> wrote in message news:<c2soid$191t46$1@ID-190842.news.uni-berlin.de>...
    > hi,
    >
    > I'd like to do the following:
    >
    > (1) open a .xml, change something and save it.
    > problem is: how to save/serialize?
    > I tried xml.dom.ext.Print, but this resolves all entities and serializes
    > the xml with resolved entities. (see example below)

    Sounds as if you want a lexical round-trip. Very few XML processing
    packages allow for this. I'd check whether pxdom supports this. If
    not, I don't expect you'll find it in Python.


    > (2) I'd also like to load external parsed entities referenced in the xml.
    > MSXML provides an extension(?) to DOM which returns the uri to an
    > entityReference-NODE.
    > any similar in pyXML. actually, is nodetype entityReference implemented
    > in pyXML. I always get the nodeType of the resolved entity, ie 3
    > (NODE_TEXT) with a internal unparsed entity.

    Again pxdom will get you closest.

    --Uche


    • Andrew Clover

      #3
      Re: pyXML beginner questions

      Sebastian Fey <fey@parsytec.de> wrote:

      > actually, is nodetype entityReference implemented in pyXML.

      Yes, but you won't ever see them from a parse operation.
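To see the same behaviour without installing anything, here is a minimal sketch using the standard library's xml.dom.minidom. Note the assumption: this is not PyXML's 4DOM, only a stand-in that also expands internal entities at parse time. The helper node_types is made up for this illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Cut-down version of the document from the original question
# (external entity left out, since it needs a second file).
SOURCE = """<!DOCTYPE root [
<!ENTITY intTxt 'GIGI'>
<!ENTITY intMarkup '<entIntern>text</entIntern>'>
]>
<root><text>&intTxt;</text>&intMarkup;</root>"""


def node_types(node):
    """Collect the nodeType of a node and of every descendant."""
    found = {node.nodeType}
    for child in node.childNodes:
        found |= node_types(child)
    return found


doc = parseString(SOURCE)
first_child = doc.getElementsByTagName("text")[0].firstChild
types_seen = node_types(doc.documentElement)

# The reference has been replaced by its expansion: a plain text node.
print(first_child.nodeType == Node.TEXT_NODE, first_child.data)
# No EntityReference node survives anywhere in the parsed tree.
print(Node.ENTITY_REFERENCE_NODE in types_seen)
# Serialising therefore writes the expanded content, not &intTxt;.
print(doc.documentElement.toxml())
```

The point is only that the reference is gone by the time the DOM exists, so no serialiser working from that tree can write it back.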

      Print() will happily serialise an entity reference as &e; providing you
      can get one into the document in the first place. Using
      Document.createEntityReference() is the only way I know.

      > (2) I'd also like to load external parsed entities referenced in the xml.
      > MSXML provides an extension(?) to DOM which returns the uri to an
      > entityReference-NODE. any similar in pyXML.

      The standard DOM way of doing it is to use the DocumentType.entities
      interface:

      doctype= entref.ownerDocument.doctype
      entdecl= doctype.entities.getNamedItem(entref.nodeName)
      uri= entdecl.systemId # see also baseURI if using DOM Level 3 Core

      This isn't any use for 4DOM as you won't get any Entity objects from its
      parse stage and you can't create your own.
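For comparison, a hedged sketch of the same lookup with the standard library's xml.dom.minidom, which (as far as I can tell) does record entity declarations from the internal subset in doctype.entities. This is an assumption about minidom only and says nothing about 4DOM. Because minidom never yields EntityReference nodes from a parse, the name is passed as a plain string where entref.nodeName would otherwise go.

```python
from xml.dom.minidom import parseString

# The external entity is declared but not referenced in the body,
# so nothing has to be fetched from disk for this illustration.
SOURCE = """<!DOCTYPE root [
<!ENTITY intTxt 'GIGI'>
<!ENTITY extParsed SYSTEM "ent.xml">
]>
<root><text>&intTxt;</text></root>"""

doc = parseString(SOURCE)
entities = doc.doctype.entities  # NamedNodeMap of Entity nodes

entity_name = "extParsed"  # stands in for entref.nodeName
entdecl = entities.getNamedItem(entity_name)
uri = entdecl.systemId
print(uri)

# Internal entities have no system identifier.
internal = entities.getNamedItem("intTxt")
print(internal.systemId)
```

The system identifier comes back exactly as written in the declaration, so a relative value such as ent.xml still has to be resolved against the document's own location before loading.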

      In DOM Level 3 Load/Save, control of whether Entity and EntityReference
      objects are kept in the document is achieved with the DOMConfiguration
      parameter 'entities':

      parser= implementation.createLSParser(1, None)
      parser.domConfig.setParameter('entities', True) # False by default
      doc= parser.parseURI('file:///in.xml')
      serialiser= implementation.createLSSerializer()
      serialiser.domConfig.setParameter('entities', True)
      serialiser.writeToURI(doc, 'file:///out.xml')

      DOM 3 LS is still at Proposed Recommendation stage and isn't supported
      by 4DOM yet. (Insert customary pxdom plug here.)

      --
      Andrew Clover
      mailto:and@doxdesk.com
