How to parse an xml with extended characters?

  • saritha2008
    New Member
    • Sep 2008
    • 9

    Hi,

    I am working on converting XML from one form to another using XSLT.

    When I try to convert the XML file below using XSLT, the error "parser error: PCDATA invalid Char value 2" is displayed. Can anyone help me resolve this issue?

    [Code]
    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- RSS generated by JIRA 91 at Thu Oct 23 06:52:25 EDT 2008 -->
    <rss version="0.92">
    <channel>
    <title>Sentinel Issue Tracking System</title>
    <link>http://bugs.esecurity.net:8090</link>
    <description>This file is an XML representation of some issues</description>
    <language>en</language>
    <item>
    <description> <![CDATA[
    thiseventtgYe ar2005tgMonth 12tgDay23tg Hour06tgMinut e29tgSecond5 3RN30C2 CNSRAVANEC 529EI529 ET5LSecu ritySNSecur ityTaudit failureUNT AUTHORITY\SYSTE MCSLogon/Logoff MLogon FailureXMLo gon Failure: Reason: Unknown user name or bad password User Name: 0075 Domain: APPLABS Logon Type: 7 Logon Process: User32 Authentication Package: Negotiate Workstation Name: SRAVAN IS0075APPL ABS7User32 NegotiateSRAV AN]]></description>
    </item>
    </channel>
    </rss>

    The XML is encoded as UTF-8, and my XSLT also outputs XML encoded as UTF-8.

    Any help in this regard is highly appreciated.

    Thanks,
    Saritha
  • Dormilich
    Recognized Expert
    • Aug 2008
    • 8694

    #2
    If your characters are outside the valid range, you may consider using XML version 1.1, though even that does not allow all characters.
    Originally posted by W3C
    Character Range (version 1.0)
    [2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
    Originally posted by W3C
    Character Range (version 1.1)
    [2] Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
    Note: of course, the surrounding programs have to support XML 1.1 too.
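
    If switching to XML 1.1 is not an option, another approach is to clean the input before it reaches the XSLT processor. The error message ("invalid Char value 2") suggests the CDATA section contains the control character U+0002, which is outside the XML 1.0 range quoted above. Note that XML 1.1 only permits such control characters as character references, and those are not expanded inside a CDATA section, so preprocessing may be needed either way. Below is a minimal Python sketch, assuming the file can be read as text first. The helper names `find_invalid_xml_chars` and `strip_invalid_xml_chars` and the sample string are made up for illustration.

    ```python
    import re
    import xml.etree.ElementTree as ET

    # Matches any character NOT allowed by the XML 1.0 Char production:
    # #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    _INVALID_XML10 = re.compile(
        "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
    )


    def find_invalid_xml_chars(text):
        """Return (offset, code point) pairs for characters XML 1.0 forbids."""
        return [(m.start(), ord(m.group())) for m in _INVALID_XML10.finditer(text)]


    def strip_invalid_xml_chars(text, replacement=""):
        """Replace every character outside the XML 1.0 Char range."""
        return _INVALID_XML10.sub(replacement, text)


    # Hypothetical sample: a CDATA payload with U+0002 used as a field separator.
    sample = "<description><![CDATA[tgYear2005\x02tgMonth12]]></description>"

    print(find_invalid_xml_chars(sample))          # where the bad characters are
    cleaned = strip_invalid_xml_chars(sample, " ")  # swap each one for a space
    print(ET.fromstring(cleaned).text)              # the cleaned text now parses
    ```

    Replacing with a space rather than deleting keeps the fields apart; if the control characters carry meaning as delimiters, substitute a visible separator instead.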

    regards
