XMLTextReader.Read()

  • midhunmathe
    New Member
    • Mar 2007
    • 4

    XMLTextReader.Read()

    Hello,

    I get the error
    #', hexadecimal value 0x07, is an invalid character. Line 2, position 6358.
    on the XmlTextReader.Read() call at a particular node of my XML document.

    I need this data to be parsed. It is Cyrillic character data, and my computer uses the Latin code page 1252. Is XML parsing dependent on the code page? If so, is there any way to parse the Unicode characters coming into my system?

    Any help is appreciated.

    Thanks and warm regards,
    - Midhun George
  • vijaydiwakar
    Contributor
    • Feb 2007
    • 579

    #2
    This error occurs when your XML contains an unwanted character in the field name tag. Check it.


    • midhunmathe
      New Member
      • Mar 2007
      • 4

      #3
      Hello Vijay,

      I know that the #x07 character cannot be parsed by XmlTextReader. I want to know the reason, and whether there is any way to parse it. If it cannot be parsed, then I want to know which characters are rejected.

      This is because my application cannot guarantee that the incoming characters can be parsed: the data is sent to it by another Unicode application, so there can be Chinese, Japanese, and all sorts of other characters coming in. My application runs on a system with code page 1252. I need to know whether it is possible to configure the system to accept all characters, even junk, rather than throwing an error (I do not mean exception handling).

      Thanks...
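
On the code page question: as far as I know, an XML parser takes the document's encoding from the byte order mark or the encoding declaration inside the XML, not from the machine's ANSI code page, so a system on code page 1252 can still read Cyrillic or Chinese text. A minimal sketch, assuming the data arrives as UTF-8 bytes (the class and method names are made up for illustration and are not from this thread):

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class, not part of any library.
public static class EncodingDemo
{
    // Returns the value of the first text node found in the XML bytes.
    // XmlReader.Create(Stream) detects the encoding from the document itself.
    public static string ReadFirstText(byte[] xmlBytes)
    {
        using (XmlReader reader = XmlReader.Create(new MemoryStream(xmlBytes)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                    return reader.Value;
            }
        }
        return null;
    }

    // Builds a small UTF-8 document with Cyrillic and Chinese text for the demo.
    public static byte[] SampleBytes()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><a>Привет 中文</a>";
        return Encoding.UTF8.GetBytes(xml);
    }
}
```

If this holds for your input, the 0x07 error is not an encoding problem at all: the document contains a character that XML itself forbids.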


      • kenobewan
        Recognized Expert Specialist
        • Dec 2006
        • 4871

        #4
        I believe that the problem is the ASCII "bell" byte code (#), and that when # is in the code XmlTextReader is expecting an ASCII char. Have you tried replacing it with its numeric character reference, &#35;?

        Here is a list of potential problems that I have found:
        Hex Value   Explanation
        0x01        Start of Heading
        0x02        Start of Text
        0x03        End of Text
        0x04        End of Transmission
        0x05        Enquiry
        0x06        Acknowledge
        0x07        Bell
        0x08        Backspace
        0x0B        Vertical Tabulation
        0x0C        Form Feed
        0x0E        Shift Out
        0x0F        Shift In
        0x10        Data Link Escape
        0x11        Device Control One
        0x12        Device Control Two
        0x13        Device Control Three
        0x14        Device Control Four
        0x15        Negative Acknowledge
        0x16        Synchronous Idle
        0x17        End of Transmission Block
        0x18        Cancel
        0x19        End of Medium
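
The list above stops at 0x19. The XML 1.0 specification defines the allowed set through its Char production, which also excludes 0x00 and 0x1A through 0x1F while allowing tab, line feed, and carriage return. A small sketch of that rule as a predicate (the class and method names are hypothetical, chosen only for this example):

```csharp
using System;

// Hypothetical helper, not from the thread or from the .NET libraries.
public static class XmlChars
{
    // Char production from the XML 1.0 specification:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static bool IsLegal(int codePoint)
    {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }
}
```

Under this rule `XmlChars.IsLegal(0x07)` is false, while Cyrillic, Chinese, and Japanese code points pass, so the check depends on the code point and not on the system code page.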


        • midhunmathe
          New Member
          • Mar 2007
          • 4

          #5
          Thanks a lot. Now I get the actual problem.

          XmlTextReader.Read() throws an exception when an unwanted character like the ones mentioned above appears in the stream. How can I ignore this?

          This is the pseudocode:
          Code:
          XmlTextReader reader = null;
          try
          {
              reader = new XmlTextReader(filename);
              reader.WhitespaceHandling = WhitespaceHandling.None;
              while (true)
              {
                  try
                  {
                      if (!reader.Read())
                          break;
                      // Some processing
                  }
                  catch (XmlException xmlExc)
                  {
                      //*** NEED TO DO SOMETHING TO SKIP THIS NODE ***
                      //*** ELSE THIS IS AN INFINITE LOOP
                      continue;
                  }
              }
          }
          catch (Exception e) // other exceptions
          {
              // some processing
          }
          finally
          {
              if (reader != null)
                  reader.Close();
          }
          Last edited by kenobewan; Mar 15 '07, 02:23 AM. Reason: Add code tags
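
One way to avoid the exception, rather than skipping the node afterwards, is to drop the forbidden characters before the parser sees them. The sketch below wraps any TextReader and filters as it reads. `XmlSanitizingReader` and `SanitizeDemo` are hypothetical names for this example, the filter silently discards data, and it leaves surrogate pairs for the parser to validate, so treat it as a starting point and not a finished solution.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical wrapper: removes characters XML 1.0 forbids while streaming.
public sealed class XmlSanitizingReader : TextReader
{
    private readonly TextReader _inner;

    public XmlSanitizingReader(TextReader inner) { _inner = inner; }

    private static bool IsAllowed(char c)
    {
        if (c == '\t' || c == '\n' || c == '\r') return true;
        // Surrogates pass through here; pairing is left to the XML parser.
        return c >= 0x20 && c != 0xFFFE && c != 0xFFFF;
    }

    public override int Read(char[] buffer, int index, int count)
    {
        while (true)
        {
            int read = _inner.Read(buffer, index, count);
            if (read <= 0) return 0; // end of input
            int kept = 0;
            for (int i = 0; i < read; i++)
            {
                char c = buffer[index + i];
                if (IsAllowed(c)) buffer[index + kept++] = c;
            }
            // Returning 0 would signal end of input, so read again
            // if every character in this chunk was dropped.
            if (kept > 0) return kept;
        }
    }

    public override int Read()
    {
        int c;
        while ((c = _inner.Read()) != -1)
        {
            if (IsAllowed((char)c)) return c;
        }
        return -1;
    }

    public override int Peek()
    {
        int c;
        while ((c = _inner.Peek()) != -1 && !IsAllowed((char)c))
            _inner.Read(); // consume and discard the forbidden character
        return c;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}

public static class SanitizeDemo
{
    // Parses the XML through the filter and returns the first text node.
    public static string ReadFirstText(string xml)
    {
        using (XmlReader reader =
            XmlReader.Create(new XmlSanitizingReader(new StringReader(xml))))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                    return reader.Value;
            }
        }
        return null;
    }
}
```

For a file, the inner reader could be a `StreamReader` over the file. In that case the caller chooses the text encoding, since the parser receives characters that are already decoded.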


          • beanwa
            New Member
            • May 2009
            • 1

            #6
            String Cleanup

            Here's a handy string cleansing class to get rid of the invalid characters before sending them to your xml file. I got a little creative naming the Hashtable.

            Code:
            string cleanString = BytesSite.StringHelp.ValidString(possibleBadString);
            
            -----------------------------------------------------------------------------------------------------------
            
            using System;
            using System.Collections.Generic;
            using System.Linq;
            using System.Text;
            using System.Collections;
            
            namespace BytesSite
            {
                internal class StringHelp
                {
                    static Hashtable Guitar = new Hashtable();
            
                    static string convertAsciiToHex(String pAsciiText) 
                    { 
                        StringBuilder sBuffer = new StringBuilder();
                        for (int i = 0; i < pAsciiText.Length; i++) 
                        {
                            sBuffer.Append(Convert.ToInt32(pAsciiText[i]).ToString("x")); 
                        } 
                        return sBuffer.ToString().ToUpper(); 
                    }
            
                    static bool invalidHex(string pHexValue)
                    {
                        populateGuitar();
                        return Guitar.ContainsKey(pHexValue);
                    }
            
                    public static string ValidString(string pString)
                    {
                        // StringBuilder avoids reallocating the string on every appended character.
                        StringBuilder returnString = new StringBuilder(pString.Length);
                        for (int i = 0; i < pString.Length; i++)
                        {
                            if (!invalidHex(convertAsciiToHex(pString[i].ToString())))
                                returnString.Append(pString[i]);
                        }
            
                        return returnString.ToString();
                    }
            
                    private static void populateGuitar()
                    {
                        if (Guitar.Count == 0)
                        {
                            Guitar.Add("1", "1");
                            Guitar.Add("2", "2");
                            Guitar.Add("3", "3");
                            Guitar.Add("4", "4");
                            Guitar.Add("5", "5");
                            Guitar.Add("6", "6");
                            Guitar.Add("7", "7");
                            Guitar.Add("8", "8");
                            Guitar.Add("10", "10");
                            Guitar.Add("11", "11");
                            Guitar.Add("12", "12");
                            Guitar.Add("13", "13");
                            Guitar.Add("14", "14");
                            Guitar.Add("15", "15");
                            Guitar.Add("16", "16");
                            Guitar.Add("17", "17");
                            Guitar.Add("18", "18");
                            Guitar.Add("19", "19");
                            Guitar.Add("B", "B");
                            Guitar.Add("C", "C");
                            Guitar.Add("E", "E");
                            Guitar.Add("F", "F");
                        }
                    }
                }
            }
            Last edited by PRR; May 28 '09, 07:10 AM. Reason: Please post code in [code] [/code] tags.
