MS XML Parser error in CData section

!NoItAll

Contributor

Join Date: May 2006

Posts: 297
#1

MS XML Parser error in CData section

Mar 8 '09, 10:04 PM

The MSXML parser is choking on a single character that often appears in my data within a CDATA section.

Try this:

Code:

Dim bRet As Boolean
Dim lRet As Long
Dim xmlDoc As MSXML2.DOMDocument

Set xmlDoc = New MSXML2.DOMDocument
xmlDoc.async = False ' load synchronously so parseError is populated when Load returns
bRet = xmlDoc.Load("d:\test.xml")
lRet = xmlDoc.parseError.filepos ' returns the position of the A9 (copyright symbol)

The file I'm loading looks like this:

<NRCS_2NEWARC RECORDNUMBER="1844">
<TreeStructure Data="19930121"/>
<![CDATA[NEWSS © 1993 All Rights Reserved ]]>
</NRCS_2NEWARC>

I saved this as d:\test.xml. Note the A9 byte (copyright symbol) inside the CDATA section: MSXML chokes on it every time. If I remove the copyright symbol, everything works as expected. Why? I thought a CDATA section was supposed to be passed through intact! The only thing you're not supposed to put in a CDATA section is ]]>, which terminates it.
This is frustrating!
I've tried it with MSXML 4, 5, and 6.
!NoItAll

Contributor

Join Date: May 2006

Posts: 297
#2

Mar 9 '09, 07:06 PM

OK - I see the problem. That byte, standing alone, is not valid UTF-8, which is what the XML parser expects when the file declares no other encoding. The correct thing to do is to convert the data to UTF-8 first - but that seems stupid to me. Again - I thought CDATA was supposed to go completely uninterpreted so you could put any old garbage in there. Apparently not - it has to be proper garbage....
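
For anyone hitting the same error, here is a minimal sketch of the "convert to UTF-8 first" step, using ADODB.Stream to re-save the file. It assumes the source file is really Windows-1252 (where byte A9 is the copyright symbol), that the project references the Microsoft ActiveX Data Objects library, and that d:\test-utf8.xml is an acceptable output path; that file name is hypothetical and not from the original post.

```vb
' Sketch only: re-save an ANSI file as UTF-8 so MSXML can load it.
' Assumptions: the input is Windows-1252, and the project has a reference
' to the Microsoft ActiveX Data Objects 2.x Library (ADODB).
Dim stm As ADODB.Stream
Dim sText As String

' Read the original bytes, decoding them as Windows-1252 text.
Set stm = New ADODB.Stream
stm.Type = adTypeText
stm.Charset = "windows-1252"
stm.Open
stm.LoadFromFile "d:\test.xml"
sText = stm.ReadText
stm.Close

' Write the same text back out encoded as UTF-8.
' "d:\test-utf8.xml" is a hypothetical output name.
Set stm = New ADODB.Stream
stm.Type = adTypeText
stm.Charset = "utf-8"
stm.Open
stm.WriteText sText
stm.SaveToFile "d:\test-utf8.xml", adSaveCreateOverWrite
stm.Close
```

Loading d:\test-utf8.xml with xmlDoc.Load should then get past the copyright symbol, since it is now stored as the two bytes C2 A9. If rewriting the file is not an option, another approach that should work is to leave the bytes alone and make the first line of the file an XML declaration that names the real encoding, for example <?xml version="1.0" encoding="windows-1252"?>.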