Minidom.Node doesn't call __getattr__

  • ArseAssassin
    New Member
    • Mar 2008
    • 5

    Minidom.Node doesn't call __getattr__

    I'm using minidom to parse XML, and to simplify accessing child nodes I'm trying to implement __getattr__ for the Node class. Yes, I know what most of you will think of that; I'll just have to be careful, won't I? ;)

    Code:
    import xml
    from xml.dom.minidom import parseString
    
    def getChild(self, name):
        print "Getting child"
        return self.getElementsByTagName(name)[0]
    
    xml.dom.minidom.Node.__getattr__ = getChild
    
    xmlDoc = parseString('''<?xml version="1.0" encoding="ISO-8859-1"?>
    <CATALOG>
      <cd>
        <title>Empire Burlesque</title>
        <ARTIST>Bob Dylan</ARTIST>
        <COUNTRY>USA</COUNTRY>
        <COMPANY>Columbia</COMPANY>
        <PRICE>10.90</PRICE>
        <YEAR>1985</YEAR>
      </cd>
      <cd>
        <title>Hide your heart</title>
        <ARTIST>Bonnie Tylor</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>CBS Records</COMPANY>
        <PRICE>9.90</PRICE>
        <YEAR>1988</YEAR>
      </cd>
    </CATALOG>''')
    
    xmlRoot = xmlDoc.childNodes[0]
    print xmlRoot.__getattr__
    print xmlRoot.cd.title.childNodes[0].nodeValue
    outputs:
    Code:
    <bound method Element.getChild of <DOM Element: CATALOG at 0x1c50850>>
      File "C:\Users\Tuomas\Projektit\wrestlevania\ezXML.py", line 31, in <module>
        print xmlRoot.cd.title.childNodes[0].nodeValue
    AttributeError: Element instance has no attribute 'cd'
    As far as I can tell, everything works fine, except __getattr__ never gets called. Can anyone help me understand why that is or how to work around it?
  • jlm699
    Contributor
    • Jul 2007
    • 314

    #2
    If you're trying to call __getattr__ you need ().

    • ArseAssassin
      New Member
      • Mar 2008
      • 5

      #3
      Originally posted by jlm699
      If you're trying to call __getattr__ you need ().
      Not trying to call it - I'm overriding it.

      • jlm699
        Contributor
        • Jul 2007
        • 314

        #4
        Originally posted by ArseAssassin
        Not trying to call it - I'm overriding it.
        No I know that... but you still need the (name) arguments that getChild needs

        • ArseAssassin
          New Member
          • Mar 2008
          • 5

          #5
          Originally posted by jlm699
          No I know that... but you still need the (name) arguments that getChild needs
          I'm not sure what you're getting at. getChild is just an implementation of the special method __getattr__, which should be called when I'm trying to access attributes that aren't defined.

          • jlm699
            Contributor
            • Jul 2007
            • 314

            #6
            Originally posted by ArseAssassin
            I'm not sure what you're getting at. getChild is just an implementation of the special method __getattr__, which should be called when I'm trying to access attributes that aren't defined.
            But aren't you overriding __getattr__ with getChild? The way your code is set up, getChild gets called instead of __getattr__... I don't know, maybe I'm misunderstanding what you're trying to accomplish in this code... Where does the .cd. method come from, since it's not an attribute of the class that you are trying to call it from?

            • bvdet
              Recognized Expert Specialist
              • Oct 2006
              • 2851

              #7
              Modify getChild() slightly:
              Code:
              def getChild(self, name):
                  print "Getting child"
                  return self.getElementsByTagName(name)
              Modify the calls:
              Code:
              xmlRoot = xmlDoc.childNodes[0]
              print xmlRoot.__getattr__('cd')
              print xmlRoot.__getattr__('cd')[0].__getattr__('title')[0].firstChild.nodeValue
              print xmlRoot.__getattr__('ARTIST')[0].firstChild.nodeValue
              Output:
              Code:
              >>> Getting child
              [<DOM Element: cd at 0x100d4b8>, <DOM Element: cd at 0x100d788>]
              Getting child
              Getting child
              Empire Burlesque
              Getting child
              Bob Dylan
              >>>

              • jlm699
                Contributor
                • Jul 2007
                • 314

                #8
                Originally posted by bvdet
                Modify the calls:
                Code:
                print xmlRoot.__getattr__('cd')
                That's what I was trying to say by using the (name) argument

                • bvdet
                  Recognized Expert Specialist
                  • Oct 2006
                  • 2851

                  #9
                  Originally posted by jlm699
                  That's what I was trying to say by using the (name) argument
                  And you were correct!

                  • jlm699
                    Contributor
                    • Jul 2007
                    • 314

                    #10
                    Originally posted by bvdet
                    Code:
                    xmlRoot = xmlDoc.childNodes[0]
                    print xmlRoot.__getattr__('cd')
                    print xmlRoot.__getattr__('cd')[0].__getattr__('title')[0].firstChild.nodeValue
                    print xmlRoot.__getattr__('ARTIST')[0].firstChild.nodeValue
                    Output:
                    Code:
                    >>> Getting child
                    [<DOM Element: cd at 0x100d4b8>, <DOM Element: cd at 0x100d788>]
                    Getting child
                    Getting child
                    Empire Burlesque
                    Getting child
                    Bob Dylan
                    >>>
                    In my calls I needed to modify it a little bit...
                    Code:
                    >>> xmlRoot.__getattr__('cd').__getattr__('title').firstChild.nodeValue
                    Getting child
                    Getting child
                    u'Empire Burlesque'
                    >>> xmlRoot.__getattr__('ARTIST').firstChild.nodeValue
                    Getting child
                    u'Bob Dylan'
                    >>>

                    • bvdet
                      Recognized Expert Specialist
                      • Oct 2006
                      • 2851

                      #11
                      Originally posted by jlm699
                      In my calls I needed to modify it a little bit...
                      I had modified function getChild() to return the NodeList object. Otherwise you could not access the other elements in the object.
                      Code:
                      >>> for elem in xmlRoot.__getattr__('cd'):
                      ...     print elem
                      ...
                      Getting child
                      <DOM Element: cd at 0xf854b8>
                      <DOM Element: cd at 0xf8b490>
                      >>>
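The point about returning the whole NodeList can be sketched without touching __getattr__ at all. The snippet below is a minimal illustration in Python 3 syntax (the thread itself uses Python 2); `child_text` is a hypothetical helper name, and the XML is a trimmed copy of the catalog from the first post. Note that getElementsByTagName matches descendants at any depth, not only direct children, which is why looking up ARTIST from the root worked above.

```python
from xml.dom.minidom import parseString

# Trimmed version of the catalog from the first post (XML declaration omitted).
CATALOG_XML = """<CATALOG>
  <cd><title>Empire Burlesque</title><ARTIST>Bob Dylan</ARTIST></cd>
  <cd><title>Hide your heart</title><ARTIST>Bonnie Tylor</ARTIST></cd>
</CATALOG>"""


def child_text(element, tag):
    """Hypothetical helper: text of the first descendant <tag> of element."""
    matches = element.getElementsByTagName(tag)  # NodeList of every matching descendant
    return matches[0].firstChild.nodeValue


root = parseString(CATALOG_XML).documentElement

# Iterating the NodeList reaches every <cd>, not just the first one.
titles = [child_text(cd, "title") for cd in root.getElementsByTagName("cd")]
artists = [child_text(cd, "ARTIST") for cd in root.getElementsByTagName("cd")]

print(titles)
print(artists)
```

Indexing with [0] inside getChild, as in the original post, would only ever expose the first cd element.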

                      • ArseAssassin
                        New Member
                        • Mar 2008
                        • 5

                        #12
                        Originally posted by jlm699
                        In my calls I needed to modify it a little bit...
                        Well, obviously that'll work. Here's what I'm going for, as the Python documentation describes __getattr__:

                        Called when an attribute lookup has not found the attribute in the usual places (i.e. it is not an instance attribute nor is it found in the class tree for self).
                        With any custom class I define, it will work:
                        Code:
                        >>> class example:
                        ... 	def __init__(self):
                        ... 		print 'Initiated'
                        ... 		
                        >>> a = example()
                        Initiated
                        >>> def getAttribute(self, name):
                        ... 	print name, 'was not found'
                        ... 	
                        >>> example.__getattr__ = getAttribute
                        >>> print a.unknownAttribute
                        unknownAttribute was not found
                        I'm having trouble understanding why it won't work for minidom nodes.
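One difference between the session above and the minidom case is that the session patches the same class the instance belongs to, while the minidom code patches Node and then uses Element instances. The "Element instance has no attribute" wording in the traceback suggests the Python 2 minidom classes are old-style classes, and as far as I understand, old-style classes keep their own per-class record of __getattr__, so assigning it on a base class later does not reach already-defined subclasses. The sketch below is in Python 3, where every class is new-style, and shows that there the same base-class patch does reach subclass instances; `Base`, `Derived` and `get_attribute` are made-up names for illustration.

```python
class Base:
    pass


class Derived(Base):
    pass


def get_attribute(self, name):
    # Invoked only after normal lookup (instance, class, base classes) has failed.
    return "%s was not found" % name


# Patch the base class after both classes already exist.
Base.__getattr__ = get_attribute

d = Derived()
d.known = 1

result_missing = d.unknownAttribute  # falls through to get_attribute
result_known = d.known               # found normally, hook is not consulted

print(result_missing)
print(result_known)
```

This is only a comparison under the stated assumption about old-style classes; it does not reproduce the Python 2 behaviour itself.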

                        • ArseAssassin
                          New Member
                          • Mar 2008
                          • 5

                          #13
                          I was finally able to figure it out: since the nodes I'm accessing are actually instances of the Element class and __getattr__ isn't part of its parent's (Node class) original class definition, the new instances aren't, I suppose, aware of the method's existence. It's definitely inherited since I can access it; the instances just won't know they're supposed to call it. That's what I figure anyway. Here's the working code:
                          Code:
                          import xml
                          from xml.dom.minidom import parseString
                          
                          def getChild(self, name):
                              print "Getting child"
                              return self.getElementsByTagName(name)[0]
                          
                          xml.dom.minidom.Element.__getattr__ = getChild
                          
                          xmlDoc = parseString('''<?xml version="1.0" encoding="ISO-8859-1"?>
                          <CATALOG>
                            <cd>
                              <title>Empire Burlesque</title>
                              <ARTIST>Bob Dylan</ARTIST>
                              <COUNTRY>USA</COUNTRY>
                              <COMPANY>Columbia</COMPANY>
                              <PRICE>10.90</PRICE>
                              <YEAR>1985</YEAR>
                            </cd>
                            <cd>
                              <title>Hide your heart</title>
                              <ARTIST>Bonnie Tylor</ARTIST>
                              <COUNTRY>UK</COUNTRY>
                              <COMPANY>CBS Records</COMPANY>
                              <PRICE>9.90</PRICE>
                              <YEAR>1988</YEAR>
                            </cd>
                          </CATALOG>''')
                          
                          xmlRoot = xmlDoc.childNodes[0]
                          print xmlRoot.__getattr__
                          print xmlRoot.cd.title.childNodes[0].nodeValue
                          Thanks for the help, guys. :)
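An alternative that avoids patching minidom's classes is a thin wrapper object that forwards attribute access to getElementsByTagName. This is only a sketch in Python 3 syntax, not what the thread settled on; `ElementProxy` and its `text` property are hypothetical names, and it raises AttributeError rather than IndexError when no element matches so that hasattr() keeps working.

```python
from xml.dom.minidom import parseString


class ElementProxy:
    """Hypothetical wrapper: attribute access returns the first matching descendant."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Refuse private names so a
        # missing _element can never cause infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        matches = self._element.getElementsByTagName(name)
        if len(matches) == 0:
            raise AttributeError(
                "no <%s> element under <%s>" % (name, self._element.tagName)
            )
        return ElementProxy(matches[0])

    @property
    def text(self):
        # Value of the first child node, or None for an empty element.
        # This shadows any element that happens to be named <text>.
        first = self._element.firstChild
        return first.nodeValue if first is not None else None


doc = parseString(
    "<CATALOG><cd><title>Empire Burlesque</title>"
    "<ARTIST>Bob Dylan</ARTIST></cd></CATALOG>"
)
root = ElementProxy(doc.documentElement)

print(root.cd.title.text)
print(root.cd.ARTIST.text)
```

The trade-off is that wrapped objects no longer expose the DOM methods directly; the underlying node stays reachable through the `_element` attribute.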

                          • bvdet
                            Recognized Expert Specialist
                            • Oct 2006
                            • 2851

                            #14
                            You are welcome, and thank you for sharing what you have found out.
