Using XmlReader with a namespace in an HTML file

  • mickey0
    New Member
    • Jan 2008
    • 142

    Hello,
    I must parse something like this:
    Code:
    <html>
    <head></head>
    <body>
    <using xmlns:namespace="myfile" />
    myText
    <p><namespace:myComponent Attribs="val"> //it crashes on 'namespace:myComponent'
    </body>
    </html>
    But my XML parser throws an error on 'namespace:myComponent'. Is there a way to overcome this? This is the parser:
    Code:
     // Needs: using System; using System.IO; using System.Xml;
     public Parser(string fileContent) {
        _settings.ConformanceLevel = ConformanceLevel.Fragment;
        _settings.IgnoreWhitespace = true;
        _settings.IgnoreComments = true;
        // Assuming fileContent holds the markup itself: XmlReader.Create(string, ...)
        // would treat the string as a URI, so wrap it in a StringReader.
        _textReader = XmlReader.Create(new StringReader(fileContent), _settings);
        // The first Read() in the loop moves to the first node; an extra Read() before it would skip that node.
        while (_textReader.Read()) {
           switch (_textReader.NodeType) {
           case XmlNodeType.Element: // The node is an element.
              Console.WriteLine("<" + _textReader.Name + ">");
              break;
           case XmlNodeType.Text: // Display the text in each element.
              Console.WriteLine(_textReader.Value);
              break;
           case XmlNodeType.EndElement: // Display the end of the element.
              Console.WriteLine("</" + _textReader.Name + ">");
              break;
           }
        }
     }
    Thanks
  • mickey0
    New Member
    • Jan 2008
    • 142

    #2
    Hello again,
    does anyone have any ideas about this?

    • Joseph Martell
      Recognized Expert New Member
      • Jan 2010
      • 198

      #3
      Can you be more specific about the error you are receiving? Also, is there a reason you are declaring the namespace in such a fashion instead of as a normal xml/xhtml document:

      Code:
      <html xmlns:namespace="myfile">

      • simongh
        New Member
        • Jun 2010
        • 3

        #4
        Presumably in the example above the html document isn't what you used for real as it's incomplete.

        Declaring the namespace on an element other than the root is perfectly valid XML, so that shouldn't cause any issue.

        • Joseph Martell
          Recognized Expert New Member
          • Jan 2010
          • 198

          #5
          True, it is valid, but isn't the namespace scope limited to the element that it is declared on? So declaring it on a self-closing element means the namespace is not valid outside of the "<using..." tag?

          If this is not correct then my apologies. I have limited experience with xml namespaces.

          • mickey0
            New Member
            • Jan 2008
            • 142

            #6
            Originally posted by jbm1313
            Can you be more specific about the error you are receiving? Also, is there a reason you are declaring the namespace in such a fashion instead of as a normal xml/xhtml document:

            Code:
            <html xmlns:namespace="myfile">
            That is my own XML/HTML-like tag language. The exception is thrown when the XmlReader reaches the tag "<namespace:myComponent", and it says that the namespace has not been declared.
            I know it's not proper XML, but I need to handle this tag language of my own. Is there any way to do that?
            I was thinking of something like this:
            Code:
            while (reader.Read()) {
                switch (reader.NodeType) {
                    case XmlNodeType.Element:
                        if (reader.Name == "using") {
                            // add as namespace in "somewhere" the part that follows "xmlns" in the 'using' tag, e.g. 'namespace'
                        }
                        break;
                    // case: ....................................
                }
            }

            • simongh
              New Member
              • Jun 2010
              • 3

              #7
              Ha! I missed that bit. That's entirely the problem. The namespace is declared on a self-closing element, so it is scoped to that element only. When the XmlReader encounters the prefix further down, it throws a "namespace undeclared" exception. What you've got otherwise is invalid XML, which no XML parser can read.
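
              To see the scoping rule in isolation, here is a minimal sketch. The class name NamespaceScopeDemo, the helper TryParse, and the shortened prefix "ns" are made up for illustration and are not part of the original code. It parses two fragments: one declares the prefix on a self-closing sibling, the other declares it on an enclosing element.

              ```csharp
              using System;
              using System.IO;
              using System.Xml;

              public static class NamespaceScopeDemo
              {
                  // Hypothetical helper: returns null if the fragment parses,
                  // otherwise the message of the XmlException that was thrown.
                  public static string TryParse(string xml)
                  {
                      var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
                      try
                      {
                          using (var reader = XmlReader.Create(new StringReader(xml), settings))
                          {
                              while (reader.Read()) { } // walk every node
                          }
                          return null;
                      }
                      catch (XmlException ex)
                      {
                          return ex.Message;
                      }
                  }

                  public static void Main()
                  {
                      // Prefix declared on a self-closing element: out of scope for the sibling that follows.
                      string sibling = "<body><using xmlns:ns=\"myfile\" /><ns:myComponent Attribs=\"val\" /></body>";
                      // Prefix declared on an enclosing element: in scope for every descendant.
                      string nested = "<body><using xmlns:ns=\"myfile\"><ns:myComponent Attribs=\"val\" /></using></body>";

                      Console.WriteLine(TryParse(sibling) ?? "parsed without error");
                      Console.WriteLine(TryParse(nested) ?? "parsed without error");
                  }
              }
              ```

              The first fragment should report an undeclared-prefix error and the second should parse cleanly, which is the difference between the original document and the two fixes in this post.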

              Change your XML to the version below, or move the namespace declaration to the myComponent element.

              Code:
              <?xml version="1.0" encoding="utf-8" ?>
              <html>
              	<head></head>
              	<body>
              		<using xmlns:namespace="myfile">
              			myText
              			<p>
              				<namespace:myComponent Attribs="val"> //it crashes on 'namespace:myCOmponent'</namespace:myComponent>
              			</p>
              		</using>
              	</body>
              	
              </html>
              or this

              Code:
              <?xml version="1.0" encoding="utf-8" ?>
              <html>
              	<head></head>
              	<body>
              		<using/>
              			myText
              			<p>
              				<namespace:myComponent xmlns:namespace="myfile" Attribs="val"> //it crashes on 'namespace:myCOmponent'</namespace:myComponent>
              			</p>
              	</body>
              	
              </html>

              • mickey0
                New Member
                • Jan 2008
                • 142

                #8
                Can you confirm that there is no way to keep my way of declaring the namespace and handle it somehow at the programming-language level?

                Thanks

                • Joseph Martell
                  Recognized Expert New Member
                  • Jan 2010
                  • 198

                  #9
                  I don't believe that you can without resorting to writing your own custom XML classes from scratch. Even if you add the schema to your XML document object programmatically (which can be done), the namespace prefix will still throw an exception during parsing because it does not have valid scope in your example case.

                  What you are talking about doing here diverges from the way the XML standard handles namespaces. That fundamentally breaks the .NET XML objects.

                  If I were you, I would take one of the changes suggested by simongh.

                  • mickey0
                    New Member
                    • Jan 2008
                    • 142

                    #10
                    Understood. Can you tell me what you mean by 'schema'? If I use the closing tag </using>, the XmlReader works perfectly, so what do I have to add (as a schema, I mean)?

                    Regards.

                    • Joseph Martell
                      Recognized Expert New Member
                      • Jan 2010
                      • 198

                      #11
                      Originally posted by mickey0
                      Understood. Can you tell me what you mean by 'schema'? If I use the closing tag </using>, the XmlReader works perfectly, so what do I have to add (as a schema, I mean)?

                      Regards.
                      Sorry, my mistake. I meant to say namespace, not schema.

                      • mickey0
                        New Member
                        • Jan 2008
                        • 142

                        #12
                        That basically doesn't change my question...

                        • Joseph Martell
                          Recognized Expert New Member
                          • Jan 2010
                          • 198

                          #13
                          Thanks for asking that last question. It made me dig deeper into the .NET XML objects.

                          In order to get your original scenario to work you would have to read the XML as a string, find the "<using..." tags, and then add them to a separate namespace manager that you use for parsing the string. MSDN has an article that shows something similar to what you are talking about.

                          MSDN Article

                          This will result in an XmlReader that reads through your xml file without generating an error.

                          Be aware that if you did something like this:

                          Code:
                          XmlDocument myDoc = new XmlDocument();
                          myDoc.Load(reader);
                          then myDoc.OuterXml will reflect the addition of the new namespaces and you will wind up with a document very similar to version 2 of simongh's suggested fixes.

                          Using the XmlNamespaceManager does provide a workaround, but I still think that simongh's suggestions would be more correct because you would start off with readable, portable, and correct XML.
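
                          A rough sketch of that workaround, under stated assumptions: the class name UsingTagPreScan, the method CreateReader, and the regular expression are hypothetical, and the pattern only recognizes a self-closing tag written exactly as <using xmlns:prefix="uri" />. It scans the string first, registers each prefix in an XmlNamespaceManager, and hands that to the reader through an XmlParserContext.

                          ```csharp
                          using System;
                          using System.IO;
                          using System.Text.RegularExpressions;
                          using System.Xml;

                          public static class UsingTagPreScan
                          {
                              // Assumed shape of the custom tag: <using xmlns:prefix="uri" />
                              private static readonly Regex UsingTag = new Regex(
                                  "<using\\s+xmlns:(?<prefix>[A-Za-z_][\\w.-]*)\\s*=\\s*\"(?<uri>[^\"]+)\"\\s*/>");

                              public static XmlReader CreateReader(string content)
                              {
                                  // Collect every prefix BEFORE the reader is created.
                                  var nsManager = new XmlNamespaceManager(new NameTable());
                                  foreach (Match m in UsingTag.Matches(content))
                                  {
                                      nsManager.AddNamespace(m.Groups["prefix"].Value, m.Groups["uri"].Value);
                                  }

                                  // A null name table makes the context reuse the one from nsManager.
                                  var context = new XmlParserContext(null, nsManager, null, XmlSpace.None);
                                  var settings = new XmlReaderSettings
                                  {
                                      ConformanceLevel = ConformanceLevel.Fragment,
                                      IgnoreWhitespace = true
                                  };
                                  return XmlReader.Create(new StringReader(content), settings, context);
                              }

                              public static void Main()
                              {
                                  string content =
                                      "<body><using xmlns:namespace=\"myfile\" />myText" +
                                      "<p><namespace:myComponent Attribs=\"val\" /></p></body>";

                                  using (XmlReader reader = CreateReader(content))
                                  {
                                      while (reader.Read())
                                      {
                                          if (reader.NodeType == XmlNodeType.Element)
                                          {
                                              Console.WriteLine(reader.Name + " -> " + reader.NamespaceURI);
                                          }
                                      }
                                  }
                              }
                          }
                          ```

                          Note that prefixes registered this way are visible to the whole document, which is looser than standard XML scoping; that is the trade-off of keeping the "<using..." syntax.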

                          • mickey0
                            New Member
                            • Jan 2008
                            • 142

                            #14
                            Yes, that would be great. For completeness, though: are you sure I can use what you describe without knowing the namespace prefix in advance? Can I embed it here?
                            Code:
                            while (reader.Read()) {
                                switch (reader.NodeType) {
                                    case XmlNodeType.Element:
                                        if (reader.Name == "using") {
                                            // add as namespace in "somewhere" the part that follows "xmlns" in the 'using' tag, e.g. 'namespace'
                                            // ONLY HERE do I know the name of the namespace I will encounter later, e.g. 'namespace'
                                        }
                                        break;
                                    // case: ....................................
                                }
                            }

                            • Joseph Martell
                              Recognized Expert New Member
                              • Jan 2010
                              • 198

                              #15
                              As far as I can tell, no, your example code would NOT work. You cannot add new namespaces once the reader object has been instantiated. When I tried that in my example code I received an exception.

                              You would have to pull out the "<using..." tags manually BEFORE you instantiate your reader object. Changes to the XmlNamespaceManager object do not affect the XmlReader after the XmlReader has been instantiated.
