Hi guys,
As a newbie, I'm trying to parse an XML file using minidom, but
I don't know why I get some extra children(?). I don't know what is
wrong with the XML file, but I've tried different XML files and still
get the same problem.
************************************************************
The XML file (fileTest) looks like:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<afc xmlns="http://python.org/:aaa" xmlns:afc="http://python.org/:foo">
<afc:Bibliography>
<File version="2.0.0.0" publicationDate="2007-02-16 11:23:06+01:00" />
<Revision version="2" />
<Application version="02.00.00" />
</afc:Bibliography>
</afc>
************************************************************
The Python file looks like:
from xml.dom import minidom
doc = minidom.parse(fileTest)
a = doc.documentElement.childNodes
print(a)
print('--------------')
for item in a:
    print(item.nodeName)
************************************************************
And the output is:
[<DOM Text node "\n">, <DOM Element: afc:Bibliography at 12082960>, <DOM Text node "\n">]
--------------
#text
afc:Bibliography
#text
************************************************************
My question is why these <DOM Text node "\n"> (#text) nodes are
created, and how I can get rid of them by changing the Python code.
(I'm not interested in changing the XML file here.)
I have searched the forum without finding a solution :-(
Thank you all in advance!
/Ben
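
On the question above: the "\n" Text nodes are the line breaks that sit between the tags in the file, and minidom keeps that whitespace as children of the element. One possible way to skip them from the Python side is to filter the children on nodeType. This is only a minimal sketch: it assumes Python 3, inlines the sample XML as a bytes literal instead of reading fileTest, and the helper name element_children is made up for illustration.

```python
from xml.dom import Node, minidom

# The sample document from the post, inlined so the sketch is self-contained.
XML = (
    b'<?xml version="1.0" encoding="ISO-8859-1" ?>\n'
    b'<afc xmlns="http://python.org/:aaa" xmlns:afc="http://python.org/:foo">\n'
    b'<afc:Bibliography>\n'
    b'<File version="2.0.0.0" publicationDate="2007-02-16 11:23:06+01:00" />\n'
    b'<Revision version="2" />\n'
    b'<Application version="02.00.00" />\n'
    b'</afc:Bibliography>\n'
    b'</afc>'
)


def element_children(node):
    """Hypothetical helper: return only the element children of node."""
    return [child for child in node.childNodes
            if child.nodeType == Node.ELEMENT_NODE]


doc = minidom.parseString(XML)
names = [child.nodeName for child in element_children(doc.documentElement)]
print(names)  # -> ['afc:Bibliography']
```

The whitespace Text nodes are still in the tree; the helper just leaves them out of the list it returns, so the XML file itself does not need to change.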