UTF-8 simple mess

  • JohnIdol
    New Member
    • Apr 2007
    • 21

    UTF-8 simple mess

    Hi All,

    I've been trying to figure this out, but no luck.

    I have an xml file with UTF-8 encoding.
    After parsing it, my C# strings show the Unicode (UTF-16) representation of the UTF-8 encoding.

    i.e. ' instead of '

    Now, I am trying this:

    Code:
    byte[] bytes;
    byte[] uniBytes;
    
    bytes = Encoding.UTF8.GetBytes(myString);
    uniBytes = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, bytes);
    myString = System.Text.Encoding.Unicode.GetString(uniBytes);
    With this, I get what look like Chinese characters instead.
    If I use ASCII instead of Unicode when asking for the string, I get the same mess I started with.
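
    For reference, a minimal sketch of what the three calls above do and of one possible repair. As written, they encode the string to UTF-8, transcode those bytes to UTF-16 and decode them as UTF-16, which should hand back the input unchanged, so they cannot undo an earlier wrong decode. The repair assumes (nothing in this thread confirms it) that the text was valid UTF-8 at the source and was decoded as ISO-8859-1 somewhere before it reached the string. MojibakeSketch, RoundTripFromPost and RepairLatin1Mojibake are made-up names for illustration.

```csharp
using System;
using System.Text;

static class MojibakeSketch
{
    // The same three calls as in the post, wrapped in a method.
    public static string RoundTripFromPost(string s)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(s);
        byte[] uniBytes = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, bytes);
        return Encoding.Unicode.GetString(uniBytes);
    }

    // Assumption: the string holds UTF-8 bytes that were decoded as ISO-8859-1.
    // ISO-8859-1 maps every byte to one char, so re-encoding with it recovers
    // the original bytes, which can then be decoded as UTF-8.
    public static string RepairLatin1Mojibake(string garbled)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] originalBytes = latin1.GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    static void Main()
    {
        string correct = "it\u2019s"; // "it's" with a curly apostrophe (U+2019)

        // Simulate the assumed fault: UTF-8 bytes read as ISO-8859-1.
        string garbled = Encoding.GetEncoding("iso-8859-1")
            .GetString(Encoding.UTF8.GetBytes(correct));

        Console.WriteLine(RoundTripFromPost(garbled) == garbled);    // True
        Console.WriteLine(RepairLatin1Mojibake(garbled) == correct); // True
    }
}
```

    If the wrong decode used a different code page, or the data is already garbled at the source, this repair does not apply; reading the XML with the correct encoding in the first place avoids the problem entirely.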

    Can anyone help?

    Thanks in advance,

    JI
  • JohnIdol
    New Member
    • Apr 2007
    • 21

    #2
    I should probably add this:

    I am retrieving mail from Gmail via the XML output of the Atom mail feed



    The strange thing is that on that page, too, there are problems with double and single quotes (I am sending mail to myself to test this), even though
    the browser's display options are set to UTF-8.

    So I need to establish whether I am causing the problem or they are, since I seem to be retrieving the same data I can see in the XML output.
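
    One way to tell the two cases apart is to look at the raw bytes of the response before any decoding. The sketch below assumes the troublesome character is the curly apostrophe U+2019 and that any double encoding went through Windows-1252; both are guesses, and FeedByteCheck, ContainsBytes and Diagnose are hypothetical names. How the raw bytes are fetched is left out.

```csharp
using System;
using System.Text;

static class FeedByteCheck
{
    // Correct UTF-8 encoding of U+2019.
    public static readonly byte[] CleanQuote = { 0xE2, 0x80, 0x99 };

    // The same character after an extra Windows-1252 -> UTF-8 pass (assumed fault).
    public static readonly byte[] DoubleEncodedQuote =
        { 0xC3, 0xA2, 0xE2, 0x82, 0xAC, 0xE2, 0x84, 0xA2 };

    // Returns true if needle occurs as a contiguous run inside haystack.
    public static bool ContainsBytes(byte[] haystack, byte[] needle)
    {
        for (int i = 0; i + needle.Length <= haystack.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && haystack[i + j] == needle[j]) j++;
            if (j == needle.Length) return true;
        }
        return false;
    }

    // Classifies the raw bytes; the double-encoded pattern is checked first.
    public static string Diagnose(byte[] rawFeed)
    {
        if (ContainsBytes(rawFeed, DoubleEncodedQuote)) return "double-encoded upstream";
        if (ContainsBytes(rawFeed, CleanQuote)) return "clean UTF-8; check local decoding";
        return "no curly apostrophe found";
    }

    static void Main()
    {
        byte[] sample = Encoding.UTF8.GetBytes("it\u2019s");
        Console.WriteLine(BitConverter.ToString(sample)); // 69-74-E2-80-99-73
        Console.WriteLine(Diagnose(sample));              // clean UTF-8; check local decoding
    }
}
```

    If the raw bytes are already double-encoded, the damage happened before the data was retrieved; if they are clean, the decoding step on the receiving side is the place to look.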

    Thanks,

    JI
