SAX XML Parse Python error message

  • goldtech

    SAX XML Parse Python error message

    Hi,
    My first attempt at SAX, but have an error message I need help with.

    The error message, code, and XML are below.

    I'd be grateful if anyone can tell me what the fix is.
    Thanks.

    >>>
    Traceback (most recent call last):
      File "C:\Python24\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 310, in RunScript
        exec codeObject in __main__.__dict__
      File "C:\pythonscripts\xml\parse3.py", line 43, in ?
        parser.parse(r'C:\perlscripts\xml\Document2.kml')
      File "C:\Python24\lib\xml\sax\expatreader.py", line 107, in parse
        xmlreader.IncrementalParser.parse(self, source)
      File "C:\Python24\lib\xml\sax\xmlreader.py", line 123, in parse
        self.feed(buffer)
      File "C:\Python24\lib\xml\sax\expatreader.py", line 207, in feed
        self._parser.Parse(data, isFinal)
      File "C:\Python24\lib\xml\sax\expatreader.py", line 303, in end_element
        self._cont_handler.endElement(name)
      File "C:\pythonscripts\xml\parse3.py", line 39, in endElement
        print self.description, str(self.coordinates)
    AttributeError: G_Handler instance has no attribute 'coordinates'
    >>>

    Code:

    from xml.sax import make_parser
    from xml.sax.handler import ContentHandler
    import string

    class G_Handler(ContentHandler):

        def __init__(self):
            self.isFolderElement = 0
            self.isdescriptionElement = 0
            self.iscoordinatesElement = 0

        def startElement(self, name, attrs):
            if name == 'Folder':
                self.isFolderElement = 1
                self.Folder = ""
            if name == 'description':
                self.isdescriptionElement = 1
                self.description = ""
            if name == 'coordinates':
                self.iscoordinatesElement = 1
                self.coordinates = ""


        def characters(self, ch):
            if self.isFolderElement == 1:
                self.Folder = ch
            if self.isdescriptionElement == 1:
                self.description = ch
            if self.iscoordinatesElement == 1:
                self.coordinates = ch

        def endElement(self, name):
            if name == 'Folder':
                self.isFolderElement = 0
            if name == 'description':
                self.isdescriptionElement = 0
            if name == 'coordinates':
                self.iscoordinatesElement = 0
            print self.description, str(self.coordinates)

    parser = make_parser()
    parser.setContentHandler(G_Handler())
    parser.parse(r'C:\perlscripts\xml\Document2.kml')



    <?xml version="1.0" encoding="UTF-8"?>
    <Folder>
    <description>
    abc
    </description>
    <coordinates>
    -84.4, 33.7
    </coordinates>
    <description>
    abc
    </description>
    <coordinates>
    -86.7, 36.1
    </coordinates>
    </Folder>
  • Stefan Behnel

    #2
    Re: SAX XML Parse Python error message

    goldtech wrote:
    > My first attempt at SAX, but have an error message I need help with.

    Just in case you prefer writing readable code over debugging SAX code into
    existence, try lxml.



    Here is a presentation you might find interesting.



    Stefan


    • goldtech

      #3
      Re: SAX XML Parse Python error message

      I would be grateful for support with the code I cited. It's not long
      and fairly standard. I'm sure my error(s) would be glaring to more
      experienced coders. I appreciated the "heads-up" about other options
      but I would be grateful for help getting this code to run. Thanks





      • Waldemar Osuch

        #4
        Re: SAX XML Parse Python error message

        On Jul 13, 3:00 pm, goldtech <goldt...@worldpost.com> wrote:
        > I would be grateful for support with the code I cited. It's not long
        > and fairly standard. I'm sure my error(s) would be glaring to more
        > experienced coders. I appreciated the "heads-up" about other options
        > but I would be grateful for help getting this code to run. Thanks

        Initialize self.coordinates in __init__,
        or indent the "print self.description, str(self.coordinates)"
        line one more level so that it runs only when </coordinates> is reached.
        You have to remember that "endElement" is called at the end
        of every element. In your case it is called for </description>, but
        the parser has not seen <coordinates> yet.
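
To see the call order described above, here is a minimal sketch that logs each callback. It is written for Python 3 (so print is a function), and EventLogger and SAMPLE are illustrative names that are not part of the original post:

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Trimmed-down version of the document from the original post.
SAMPLE = b"""<Folder>
<description>abc</description>
<coordinates>-84.4, 33.7</coordinates>
</Folder>"""


class EventLogger(ContentHandler):
    """Hypothetical helper: records element start/end events in order."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("start " + name)

    def endElement(self, name):
        self.events.append("end " + name)


logger = EventLogger()
xml.sax.parseString(SAMPLE, logger)
for event in logger.events:
    print(event)
```

The log shows "end description" before "start coordinates", which is why a print placed directly in endElement runs before self.coordinates has ever been assigned.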

        In "def characters" you should be collecting the "ch" in a buffer.
        It may be called multiple times for the same element.
        Something like "self.description += ch" would do for starters.

        Also you do not need to convert self.coordinates to string before
        printing, it is already a string and even if it was not "print"
        would convert it for you.
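
Putting these suggestions together, one possible corrected handler might look like the sketch below. It assumes Python 3 and parses an in-memory copy of the posted XML instead of the file on disk. The names GHandler, SAMPLE, and pairs are illustrative and do not come from the original code:

```python
import xml.sax
from xml.sax.handler import ContentHandler

# In-memory copy of the XML from the original post.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Folder>
<description>
abc
</description>
<coordinates>
-84.4, 33.7
</coordinates>
<description>
abc
</description>
<coordinates>
-86.7, 36.1
</coordinates>
</Folder>"""


class GHandler(ContentHandler):
    def __init__(self):
        ContentHandler.__init__(self)
        self.in_description = False
        self.in_coordinates = False
        # Initialized here so the attributes always exist.
        self.description = ""
        self.coordinates = ""
        self.pairs = []  # collected (description, coordinates) tuples

    def startElement(self, name, attrs):
        if name == "description":
            self.in_description = True
            self.description = ""  # start a fresh buffer
        elif name == "coordinates":
            self.in_coordinates = True
            self.coordinates = ""

    def characters(self, content):
        # May be called several times per element, so append, don't assign.
        if self.in_description:
            self.description += content
        elif self.in_coordinates:
            self.coordinates += content

    def endElement(self, name):
        if name == "description":
            self.in_description = False
        elif name == "coordinates":
            self.in_coordinates = False
            # Only report once both values have been seen.
            pair = (self.description.strip(), self.coordinates.strip())
            self.pairs.append(pair)
            print(pair[0], pair[1])


handler = GHandler()
xml.sax.parseString(SAMPLE, handler)
```

Reporting from the </coordinates> branch relies on each description being followed by its coordinates, as in the sample document; a file with a different layout would need a different trigger.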

        That's it for now :-) Others may spot more issues with
        your code or my response.
        On the positive side I really liked how you asked
        the question. There was a short runnable example and traceback.

        Waldemar


        • goldtech

          #5
          Re: SAX XML Parse Python error message

          Putting the print statements where they won't cause trouble and
          using += ch (instead of plain =) in the characters method fixed it:

          ....
          def endElement(self, name):
              ....
              if name == 'description':
                  self.isdescriptionElement = 0
                  print self.description
              if name == 'coordinates':
                  self.iscoordinatesElement = 0
                  print self.coordinates
          ....

          I need to read your answer again carefully - I don't know if what I
          did is best - but it seemed to fix it. Thank you for the clear and
          cogent answer.

          Lee G.


          • Fredrik Lundh

            #6
            Re: SAX XML Parse Python error message

            goldtech wrote:
            > I would be grateful for support with the code I cited. It's not long
            > and fairly standard. I'm sure my error(s) would be glaring to more
            > experienced coders. I appreciated the "heads-up" about other options
            > but I would be grateful for help getting this code to run. Thanks

            For comparison, here's how an experienced Python programmer might prefer
            to write your code:

            import xml.etree.cElementTree as ET

            description = None # most recently seen description

            # iterparse (not parse) yields (event, elem) pairs as each element ends
            for event, elem in ET.iterparse("somefile.xml"):
                if elem.tag == "description":
                    description = elem.text
                elif elem.tag == "coordinates":
                    print description.strip(), elem.text.strip()

            You may want to ask yourself why you prefer to struggle with obsolete,
            error-prone, and slow technology when there are more efficient tools
            available in Python's standard library.

            (the lxml library that Stefan linked to is a superset of xml.etree, in
            case you want more XML features).
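
For readers on a current interpreter, here is a hedged Python 3 sketch of the same idea. It uses xml.etree.ElementTree, since the cElementTree alias was removed in Python 3.9, and reads the posted XML from an in-memory buffer instead of "somefile.xml". SAMPLE and pairs are illustrative names:

```python
import io
import xml.etree.ElementTree as ET

# In-memory copy of the XML from the original post.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Folder>
<description>
abc
</description>
<coordinates>
-84.4, 33.7
</coordinates>
<description>
abc
</description>
<coordinates>
-86.7, 36.1
</coordinates>
</Folder>"""

description = None  # most recently seen description
pairs = []

# By default iterparse reports only "end" events, so elem.text is complete.
for event, elem in ET.iterparse(io.BytesIO(SAMPLE)):
    if elem.tag == "description":
        description = elem.text
    elif elem.tag == "coordinates":
        pairs.append((description.strip(), elem.text.strip()))

for desc, coords in pairs:
    print(desc, coords)
```

Like the SAX versions, this assumes every coordinates element is preceded by a description.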

            </F>
