package similar to XML::Simple

  • Paulo Pinto

    package similar to XML::Simple

    Hi,


    does anyone know of a Python package that
    is able to load XML like the XML::Simple
    Perl package does?

    For those that don't know it, this package
    maps the XML file to a dictionary.

    Of course I can build such a package myself
    but it would be better if it already exists :)

    --
    Paulo Pinto

  • Pierre N

    #2
    Re: package similar to XML::Simple

    I'm using pyRXP, and it's great.
    It uses one tuple, not dictionaries.
    Very, very fast.
    By the way, I'm just starting to use this package. Has anybody run into
    any problems with pyRXP?

    -- Pierre


    On Wed, 2004-01-28 at 09:53, Paulo Pinto wrote:
    > Hi,
    >
    >
    > does anyone know of a Python package that
    > is able to load XML like the XML::Simple
    > Perl package does?
    >
    > For those that don't know it, this package
    > maps the XML file to a dictionary.
    >
    > Of course I can build such a package myself
    > but it would be better if it already exists :)
    >
    > --
    > Paulo Pinto



    • Harald Massa

      #3
      Re: package similar to XML::Simple

      Paulo Pinto
      > does anyone know of a Python package that
      > is able to load XML like the XML::Simple
      > Perl package does?

      Good to ask! I know of at least three packages that do something similar:

      - Fredrik Lundh's elementtree
      - D. Mertz's gnosis xml utilities
      - handyxml

      Just google for them.


      • Peter Hansen

        #4
        Re: package similar to XML::Simple

        Paulo Pinto wrote:
        >
        > does anyone know of a Python package that
        > is able to load XML like the XML::Simple
        > Perl package does?
        >
        > For those that don't know it, this package
        > maps the XML file to a dictionary.

        A simple dictionary is insufficient to represent XML in general,
        so perhaps you're talking about a subset of XML, maybe with no
        attributes, and where the order of the child elements doesn't
        matter? Or something else?

        Or do you really mean something like a multiply-nested
        dictionary, perhaps with lists as well?

        > Of course I can build such a package myself
        > but it would be better if it already exists :)

        We were able to build something similar by stripping down
        Fredrik Lundh's elementtree until we had little more than the
        calls to the expat parser (i.e. we used his source as a tutorial
        on using expat :-), so if this is something like the XML-subset
        I mention above, you could do it in an hour or so from scratch
        if you knew Python well.

        -Peter
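
For anyone wondering what "little more than the calls to the expat parser" might look like, here is a minimal sketch. It is not Peter's code: the function name `load_simple` and the exact mapping rules are my own assumptions, and it deliberately handles only the subset described above (no attributes, no mixed content, sibling order kept only within a repeated tag).

```python
import xml.parsers.expat


def load_simple(xml_text):
    """Parse a small XML subset into nested dicts, lists and strings.

    Attributes are ignored and the relative order of differently named
    siblings is lost, which is the information loss discussed above.
    """
    root = {}
    stack = [root]   # dicts being filled, innermost last
    text = []        # character data seen since the last tag

    def start(name, attrs):
        child = {}
        parent = stack[-1]
        if name in parent:
            # A repeated tag name turns the entry into a list.
            if not isinstance(parent[name], list):
                parent[name] = [parent[name]]
            parent[name].append(child)
        else:
            parent[name] = child
        stack.append(child)
        text.clear()

    def chars(data):
        text.append(data)

    def end(name):
        node = stack.pop()
        parent = stack[-1]
        if not node:
            # Leaf element: store its text instead of an empty dict.
            value = "".join(text).strip()
            if isinstance(parent[name], list):
                parent[name][-1] = value
            else:
                parent[name] = value
        text.clear()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return root


config = load_simple(
    "<config><host>a</host><user>x</user><user>y</user></config>"
)
print(config)
```

Because expat does the real parsing, character references and encodings are handled conformantly; only the data model on top is simplified.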


        • Paulo Pinto

          #5
          Re: package similar to XML::Simple

          I mean multiple nested dictionaries with lists.

          But handyxml seems to solve my problem.

          Thanks, guys
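
As an illustration of "multiple nested dictionaries with lists", here is a rough sketch built on the ElementTree API (shown with the `xml.etree.ElementTree` module that ships with current Python). It does not reproduce handyxml or XML::Simple exactly: the function names and the `"content"` key used for text next to attributes are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def element_to_data(elem):
    """Return a str for a plain leaf, otherwise a dict of attributes and children."""
    children = list(elem)
    if not children and not elem.attrib:
        return (elem.text or "").strip()

    data = dict(elem.attrib)  # attributes become ordinary keys
    for child in children:
        value = element_to_data(child)
        if child.tag in data:
            # Second occurrence of a key: promote the entry to a list.
            if not isinstance(data[child.tag], list):
                data[child.tag] = [data[child.tag]]
            data[child.tag].append(value)
        else:
            data[child.tag] = value

    text = (elem.text or "").strip()
    if text:
        data["content"] = text  # hypothetical key name for leftover text
    return data


def xml_to_dict(xml_text):
    """Parse a document and wrap the converted root under its tag name."""
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_data(root)}


doc = (
    '<servers>'
    '<server name="a"><port>80</port></server>'
    '<server name="b"><port>81</port></server>'
    '</servers>'
)
print(xml_to_dict(doc))
```

Note the limits of the mapping: an attribute and a child element with the same name end up in one list, and document order between different tags is not preserved.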

          Peter Hansen wrote:
          > Paulo Pinto wrote:
          >
          >>does anyone know of a Python package that
          >>is able to load XML like the XML::Simple
          >>Perl package does?
          >>
          >>For those that don't know it, this package
          >>maps the XML file to a dictionary.
          >
          >
          > A simple dictionary is insufficient to represent XML in general,
          > so perhaps you're talking about a subset of XML, maybe with no
          > attributes, and where the order of the child elements doesn't
          > matter? Or something else?
          >
          > Or do you really mean something like a multiply-nested
          > dictionary, perhaps with lists as well?
          >
          >
          >>Of course I can build such a package myself
          >>but it would be better if it already exists :)
          >
          >
          > We were able to build something similar by stripping down
          > Fredrik Lundh's elementtree until we had little more than the
          > calls to the expat parser (i.e. we used his source as a tutorial
          > on using expat :-), so if this is something like the XML-subset
          > I mention above, you could do it in an hour or so from scratch
          > if you knew Python well.
          >
          > -Peter


          • Uche Ogbuji

            #6
            Re: package similar to XML::Simple

            Pierre N <pierren@mac.com> wrote in message news:<mailman.904.1075287732.12720.python-list@python.org>...
            > I'm using pyRXP, and it's great.
            > It's using one tuple, not dictionaries.
            > Very very fast.
            > By the way I'm just starting using this package, anybody met any
            > problems with pyRXP?

            I did. It's not an XML parser :-(. It does not accept character
            entities such as … (the example that bit me), giving meaningless
            "error" messages along the lines: "not a valid 8-bit XML character".
            If you need an XML parser, use PyRXPU, which comes in ReportLab CVS
            only. It is not as fast as PyRXP, but conformant in my testing, and
            the point of XML is conformance, not speed at all costs. If you want
            speed at all costs, use CSV or some other plain text format.

            I'm writing at length about this unfortunate PyRXP situation in my
            next ORA python/XML column (expected Weds).

            --Uche
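
To make the complaint concrete, here is a small check of what a conformant parser does with a numeric character reference above U+00FF. It uses the expat-based `xml.etree.ElementTree` from the standard library, not PyRXP or PyRXPU, and the euro sign (`&#x20AC;`) is an arbitrary stand-in chosen for this sketch, not necessarily the reference from the post.

```python
import xml.etree.ElementTree as ET

# U+20AC is just an example of a character beyond the 8-bit range.
doc = "<price>&#x20AC;10</price>"

text = ET.fromstring(doc).text
print(repr(text))          # the reference is resolved to a real character
print(hex(ord(text[0])))   # code point of the first character
```

A parser that rejects this document is rejecting well-formed XML, which is the heart of the objection.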


            • Uche Ogbuji

              #7
              Re: package similar to XML::Simple

              Paulo Pinto <paulo.pinto@cern.ch> wrote in message news:<bv80qq$kbg$1@sunnews.cern.ch>...
              > Hi,
              >
              >
              > does anyone know of a Python package that
              > is able to load XML like the XML::Simple
              > Perl package does?
              >
              > For those that don't know it, this package
              > maps the XML file to a dictionary.
              >
              > Of course I can build such a package myself
              > but it would be better if it already exists :)

              FWIW: http://www.xml.com/pub/a/2004/01/14/py-xml.html

              --Uche


              • Peter Hansen

                #8
                Re: package similar to XML::Simple

                Uche Ogbuji wrote:
                >
                > Pierre N <pierren@mac.com> wrote in message news:<mailman.904.1075287732.12720.python-list@python.org>...
                > > I'm using pyRXP, and it's great.
                > > It's using one tuple, not dictionaries.
                > > Very very fast.
                > > By the way I'm just starting using this package, anybody met any
                > > problems with pyRXP?
                >
                > I did. It's not an XML parser :-(. It does not accept character
                > entities such as … (the example that bit me), giving meaningless
                > "error" messages along the lines: "not a valid 8-bit XML character".
                > If you need an XML parser, use PyRXPU, which comes in ReportLab CVS
                > only. It is not as fast as PyRXP, but conformant in my testing, and
                > the point of XML is conformance, not speed at all costs. If you want
                > speed at all costs, use CSV or some other plain text format.

                Hmm... so it's your opinion that *all* XML parsers must handle *all*
                aspects of XML? If not, I think you should back off on the criticism
                of PyRXP as being "not an XML parser" and simply point out that it
                doesn't handle all aspects of XML because it is intended to provide
                a very fast/heavily optimized approach to parsing only certain kinds
                of XML. It's a valid choice to do so, though of course if PyRXP is
                promoted as a "full" XML solution that might be inaccurate.

                -Peter


                • Martin v. Löwis

                  #9
                  Re: package similar to XML::Simple

                  Peter Hansen wrote:
                  > Hmm... so it's your opinion that *all* XML parsers must handle *all*
                  > aspects of XML? If not, I think you should back off on the criticism
                  > of PyRXP as being "not an XML parser" and simply point out that it
                  > doesn't handle all aspects of XML because it is intended to provide
                  > a very fast/heavily optimized approach to parsing only certain kinds
                  > of XML.

                  I am not Uche, but I think that all XML parsers should conform to the
                  XML recommendation (and treat deviations from the XML recommendation
                  as bugs).

                  This is not the same as handling all aspects of XML, since the XML
                  recommendation makes certain aspects optional. Processing character
                  references is not one of them (but e.g. validation is).

                  > It's a valid choice to do so, though of course if PyRXP is
                  > promoted as a "full" XML solution that might be inaccurate.

                  Packages may help process only selected XML documents, and they
                  may also support documents which are not XML. However, in neither
                  case should they call themselves "XML parsers". "XML-like parsers"
                  or "XML subset parsers" might be more appropriate.

                  Regards,
                  Martin


                  • Uche Ogbuji

                    #10
                    An XML parser is an XML parser. Period.

                    Peter Hansen <peter@engcorp.com> wrote in message news:<40290854.15BB5CF0@engcorp.com>...
                    > Uche Ogbuji wrote:
                    > >
                    > > Pierre N <pierren@mac.com> wrote in message news:<mailman.904.1075287732.12720.python-list@python.org>...
                    > > > I'm using pyRXP, and it's great.
                    > > > It's using one tuple, not dictionaries.
                    > > > Very very fast.
                    > > > By the way I'm just starting using this package, anybody met any
                    > > > problems with pyRXP?
                    > >
                    > > I did. It's not an XML parser :-(. It does not accept character
                    > > entities such as … (the example that bit me), giving meaningless
                    > > "error" messages along the lines: "not a valid 8-bit XML character".
                    > > If you need an XML parser, use PyRXPU, which comes in ReportLab CVS
                    > > only. It is not as fast as PyRXP, but conformant in my testing, and
                    > > the point of XML is conformance, not speed at all costs. If you want
                    > > speed at all costs, use CSV or some other plain text format.
                    >
                    > Hmm... so it's your opinion that *all* XML parsers must handle *all*
                    > aspects of XML?

                    XML is clear on what a Parser *must* support. The full character
                    production is one of those things. From XML 1.0, section 2.2:

                    Character Range
                    [2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
                    [#x10000-#x10FFFF]

                    There is no "option" to not support characters greater than #xFF. XML
                    parsers *can* leave off handling some aspects of XML, external DTD
                    subsets, for example, but you can not be as fundamentally
                    non-conformant as PyRXP and still call yourself an XML parser.
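
The Char production quoted above translates directly into a range check. The sketch below is only that check written out in Python; the helper names `is_xml_char` and `first_illegal_char` are made up for this example and are not part of any parser's API.

```python
def is_xml_char(code_point):
    """True if the code point matches production [2] Char of XML 1.0."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def first_illegal_char(text):
    """Return the index of the first non-Char character, or None if all are legal."""
    for index, char in enumerate(text):
        if not is_xml_char(ord(char)):
            return index
    return None


print(is_xml_char(0x2026))            # a character well above #xFF
print(first_illegal_char("ok\x00"))   # NUL is outside the production
```

Nothing in the production singles out #xFF as a boundary: every code point from #x20 up to #xD7FF is treated the same way.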

                    This is not just an academic matter. There are a *vast* number of
                    useful and heavily-used characters of code point higher than U+FF and
                    if parsers decided on a whim to pick and choose what to support the
                    result would be complete and utter chaos.

                    > If not, I think you should back off on the criticism
                    > of PyRXP as being "not an XML parser" and simply point out that it
                    > doesn't handle all aspects of XML because it is intended to provide
                    > a very fast/heavily optimized approach to parsing only certain kinds
                    > of XML. It's a valid choice to do so, though of course if PyRXP is
                    > promoted as a "full" XML solution that might be inaccurate.

                    PyRXP is not an XML parser. It's that simple. I stand by that very
                    strong statement, and I'd be surprised if any XML expert refused to
                    corroborate it.

                    I do want to point out that PyRXPU does seem to be a proper XML
                    parser, and is what people should use instead if they like the
                    ReportLab products.

                    Of course, if you don't really need an XML parser, feel free to use
                    PyRXP. Just don't call it what it isn't.

                    --Uche
                    Igbo-American immigrant from Nigeria, settled near Boulder, Colorado with my wife, three sons and daughter. Restless mind in a restless body, I do a million things without getting very much truly done


                    • Uche Ogbuji

                      #11
                      Re: package similar to XML::Simple

                      "Martin v. Löwis" <martin@v.loewis.de> wrote in message news:<c0bc8v$ibu$01$1@news.t-online.com>...
                      > Peter Hansen wrote:
                      > > Hmm... so it's your opinion that *all* XML parsers must handle *all*
                      > > aspects of XML? If not, I think you should back off on the criticism
                      > > of PyRXP as being "not an XML parser" and simply point out that it
                      > > doesn't handle all aspects of XML because it is intended to provide
                      > > a very fast/heavily optimized approach to parsing only certain kinds
                      > > of XML.
                      >
                      > I am not Uche, but I think that all XML parsers should conform to the
                      > XML recommendation (and treat deviations from the XML recommendation
                      > as bugs).
                      >
                      > This is not the same as handling all aspects of XML, since the XML
                      > recommendation makes certain aspects optional. Processing character
                      > references is not one of them (but e.g. validation is).
                      >
                      > > It's a valid choice to do so, though of course if PyRXP is
                      > > promoted as a "full" XML solution that might be inaccurate.
                      >
                      > Packages may help processing only selected XML documents, and they
                      > may also support documents which are not XML. However, in neither
                      > case, they should call themselves "XML parsers". "XML-like parsers"
                      > or "XML subset parsers" might be more appropriate.

                      I wouldn't argue with calling PyRXP an "XML-like parser".

                      Because until very recently I thought that PyRXP was an XML parser, I
                      was extremely taken aback when I ran afoul of PyRXP's brazen character
                      non-conformance. As an example of the danger in this non-conformance,
                      PyRXP refused to parse the very first well-formed XML document I gave
                      it. And I'm (mostly) a native English speaker. True XML parsers
                      strive for interoperability for a reason. Not doing so pretty much
                      negates the value of XML.

                      I was even more taken aback to read that the PyRXP developers refused
                      to make the simple fix needed for conformance. I think it is
                      essential to point out that a tool that refuses XML conformance cannot
                      go about calling itself an XML parser.


                      --Uche


                      • Peter Hansen

                        #12
                        Re: package similar to XML::Simple

                        Uche Ogbuji wrote:
                        >
                        > I was even more taken aback to read that the PyRXP developers refused
                        > to make the simple fix needed for conformance.

                        This is a very relevant data point that was missing in the discussion
                        until now.

                        Given that situation, I'd agree that labelling PyRXP simply an "XML parser"
                        without qualification is misleading and wrong.

                        -Peter


                        • Chris Herborth

                          #13
                          Re: package similar to XML::Simple

                          Paulo Pinto wrote:
                          > does anyone know of a Python package that
                          > is able to load XML like the XML::Simple
                          > Perl package does?

                          Despite all of the, uh, _discussion_ in this thread, I'd like to thank you
                          folks for pointing out pyRXP... I hadn't found that before, and if I can
                          whip up a pyRXP -> DOM2 translator, it will fit my needs _perfectly_.

                          Thanks!

                          --
                          Chris Herborth chrish@cryptocard.com
                          Documentation Overlord, CRYPTOCard Corp. http://www.cryptocard.com/
                          Never send a monster to do the work of an evil scientist.


                          • Paul Boddie

                            #14
                            Re: package similar to XML::Simple

                            Chris Herborth <chrish@cryptocard.com> wrote in message news:<Wj3Yb.3914$Cd6.174692@news20.bellglobal.com>...
                            >
                            > Despite all of the, uh, _discussion_ in this thread, I'd like to thank you
                            > folks for pointing out pyRXP... I hadn't found that before, and if I can
                            > whip up a pyRXP -> DOM2 translator, it will fit my needs _perfectly_.

                            Well, if it is true what people claim about dictionaries and tuples
                            being faster than objects, then you may see any supposed performance
                            advantage claimed by the PyRXP proponents just dissolve away as you
                            instantiate all those nodes. But as I noted with respect to "double
                            wrapping" libxml2, if you can restrict yourself to very few high-level
                            operations through those layers, and then invoke various "native"
                            methods directly, then it could still be worth it.

                            Paul
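
To show where that node-instantiation cost would come from, here is a rough sketch of a tuple-tree-to-DOM translator. It assumes the tuple layout I understand pyRXP to produce, `(tagName, attributes, children, spare)` with `None` for missing attributes or children; the sample tree is written by hand and pyRXP itself is not imported. The target is `xml.dom.minidom` from the standard library, which is a lightweight DOM and only an approximation of DOM Level 2.

```python
from xml.dom.minidom import getDOMImplementation


def tuple_tree_to_dom(tree):
    """Build a minidom Document from a (tag, attrs, children, spare) tuple tree."""
    tag, attrs, children, _spare = tree
    document = getDOMImplementation().createDocument(None, tag, None)
    _fill(document, document.documentElement, attrs, children)
    return document


def _fill(document, element, attrs, children):
    """Copy attributes and children from the tuple form onto one DOM element."""
    for name, value in (attrs or {}).items():
        element.setAttribute(name, value)
    for child in children or []:
        if isinstance(child, str):
            # Plain strings in the children list are text content.
            element.appendChild(document.createTextNode(child))
        else:
            tag, child_attrs, grandchildren, _spare = child
            node = document.createElement(tag)  # one new object per element
            element.appendChild(node)
            _fill(document, node, child_attrs, grandchildren)


# Hand-written sample in the assumed layout.
sample = ("a", {"x": "1"}, ["hi", ("b", None, None, None)], None)
dom = tuple_tree_to_dom(sample)
print(dom.documentElement.toxml())
```

Every element and text run becomes a separate Python object here, so any speed advantage of the tuple form is paid back during the conversion, as noted above.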
