Strange behaviour from XmlReader

  • Jan Obrestad

    Strange behaviour from XmlReader

    Hello,

    I've been using the XmlReader class to read XML files.
    Lately it has shown some strange quirks: it seems to ignore elements when
    there is no whitespace between them.

    Example:
    <main><textline><textelement>A</textelement>
    <textelement>B</textelement> <textelement>C</textelement></textline></main>
    (really on one line)

    works fine, but with

    <main><textline><textelement>A</textelement><textelement>B</textelement><textelement>C</textelement></textline></main>

    it skips element B.

    and with
    <main><textline><textelement>A</textelement>
    <textelement>B</textelement><textelement>C</textelement></textline></main>

    it skips element C

    A short program to demonstrate the problem:

    using System;
    using System.Xml;
    using System.IO;

    namespace XmlTest
    {
        class TestXml
        {
            public static void Main()
            {
                string xml =
                    "<main><textline><textelement>A</textelement><textelement>B</textelement><textelement>C</textelement></textline></main>";
                StringReader strReader = new StringReader(xml);
                XmlReader reader = XmlReader.Create(strReader);
                reader.MoveToContent();
                int i = 0;
                while (!(reader.NodeType == XmlNodeType.EndElement &&
                         reader.Name.Equals("textline")))
                {
                    reader.Read();
                    if (reader.NodeType == XmlNodeType.Element &&
                        reader.Name.Equals("textelement"))
                    {
                        string text = reader.ReadElementString();
                        Console.WriteLine("{0} {1}", i, text);
                        i++;
                    }
                }
                Console.ReadLine();
            }
        }
    }

    This now prints
    0 A
    1 C

    It should have printed
    0 A
    1 B
    2 C

    Does anyone have any idea what I might be doing wrong?
    I had always thought that whitespace was irrelevant in XML.


    Jan Obrestad

  • Martin Honnen

    #2
    Re: Strange behaviour from XmlReader

    Jan Obrestad wrote:
    string xml =
        "<main><textline><textelement>A</textelement><textelement>B</textelement><textelement>C</textelement></textline></main>";

    StringReader strReader = new StringReader(xml);
    XmlReader reader = XmlReader.Create(strReader);
    reader.MoveToContent();

    Now the reader is positioned on the 'main' start tag.

    int i = 0;
    while (!(reader.NodeType == XmlNodeType.EndElement &&
             reader.Name.Equals("textline")))
    {
        reader.Read();
        if (reader.NodeType == XmlNodeType.Element &&
            reader.Name.Equals("textelement"))
        {
            string text = reader.ReadElementString();
            Console.WriteLine("{0} {1}", i, text);
            i++;
        }
    }

    For the first run of the loop:
    reader.Read() moves the reader to the 'textline' start tag.
    On the second run of the loop:
    reader.Read() moves the reader to the first 'textelement' start tag.
    reader.ReadElementString() moves the reader past the 'textelement' end
    tag, meaning it is now positioned on the second 'textelement' start tag.
    Output: i as 0, text as 'A'.
    i is set to 1.
    On the third run of the loop:
    reader.Read() moves the reader to the text node with contents 'B'.
    So there you have the problem: the Read() call steps over the second
    'textelement' start tag before your condition can test for it, so your
    combination of Read/ReadElementString never finds that element.
    The following Read() calls reach the 'textelement' end tag and then the
    third 'textelement' start tag, which is why 'C' is printed as number 1.
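
    The walk-through above can be reproduced with a small trace. This is only a sketch: the class name NodeTrace and the helpers Describe and Trace are made up for illustration, and the end-of-input guard on Read() is an addition that the original loop does not have. It replays the loop from the question and records where each call leaves the reader.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class NodeTrace
{
    // Describes the node the reader is currently positioned on.
    static string Describe(XmlReader reader)
    {
        return reader.NodeType == XmlNodeType.Text
            ? "Text '" + reader.Value + "'"
            : reader.NodeType + " '" + reader.Name + "'";
    }

    // Replays the loop from the question and logs the reader position after each call.
    public static List<string> Trace(string xml)
    {
        var log = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            log.Add("MoveToContent -> " + Describe(reader));
            while (!(reader.NodeType == XmlNodeType.EndElement && reader.Name == "textline"))
            {
                // Guard added for this sketch: stop if the input ends unexpectedly.
                if (!reader.Read()) break;
                log.Add("Read -> " + Describe(reader));
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "textelement")
                {
                    // ReadElementString() consumes the end tag and moves to the node after it.
                    string text = reader.ReadElementString();
                    log.Add("ReadElementString = '" + text + "' -> " + Describe(reader));
                }
            }
        }
        return log;
    }

    static void Main()
    {
        string xml = "<main><textline><textelement>A</textelement>" +
                     "<textelement>B</textelement><textelement>C</textelement>" +
                     "</textline></main>";
        foreach (string line in Trace(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

    For the input without whitespace, the trace should show the Read() that follows the first ReadElementString() landing on the text node 'B' rather than on an element start tag.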

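    As a solution you might want to use ReadString() instead of
    ReadElementString(). ReadString() leaves the reader on the end tag, so
    the next Read() moves on to whatever follows it, including a start tag
    that comes directly after.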
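
    A minimal sketch of the loop with that change applied. The class name TextElementReader and the helper ReadTextElements are hypothetical, and the loop is driven by the return value of Read() here (an assumption of this sketch, not part of the original program) so that it also terminates if no 'textline' end tag is ever seen.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class TextElementReader
{
    // Hypothetical helper: collects the text of every <textelement> inside <textline>.
    public static List<string> ReadTextElements(string xml)
    {
        var texts = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "textline")
                {
                    break;
                }
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "textelement")
                {
                    // ReadString() stops on the end tag, so the next Read()
                    // advances to the following node instead of stepping over it.
                    texts.Add(reader.ReadString());
                }
            }
        }
        return texts;
    }

    static void Main()
    {
        string xml = "<main><textline><textelement>A</textelement>" +
                     "<textelement>B</textelement><textelement>C</textelement>" +
                     "</textline></main>";
        List<string> texts = ReadTextElements(xml);
        for (int i = 0; i < texts.Count; i++)
        {
            Console.WriteLine("{0} {1}", i, texts[i]);
        }
    }
}
```

    With this loop the result should be the same whether or not there is whitespace between the elements, because whitespace nodes simply fail the element test and are skipped by the next Read().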


    --

    Martin Honnen --- MVP XML

    • Jan Obrestad

      #3
      Re: Strange behaviour from XmlReader

      Martin Honnen wrote:
      As a solution you might want to use ReadString() instead of
      ReadElementString().
      That fixed it.
      Thank you!

      Jan
