A bug in .Net Binary Serialization?

  • ztRon

    A bug in .Net Binary Serialization?

    Hi all,

    I recently came across something really strange and after a couple of days
    of debugging, I finally nailed the cause of it. However, I have absolutely no
    idea what I am doing wrong or is it just a bug in binary serialization. The
    following is a simple example of the code:




    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    namespace ConsoleApplication5
    {
        class Program
        {
            static void Main(string[] args)
            {
                A a = new A();
                B b = new B(a);

                // Build a list of 10,000 C instances, each holding an empty dictionary.
                List<C> cList = new List<C>();
                for (int i = 0; i < 10000; i++)
                {
                    cList.Add(new C("someValue"));
                }
                b.CList = cList;

                MemoryStream stream = new MemoryStream();
                BinaryFormatter objFormatter = new BinaryFormatter();
                objFormatter.Serialize(stream, b);

                // Report the size of the serialized stream.
                Console.WriteLine(stream.Length);
            }
        }

        [Serializable]
        class A
        {
            private Dictionary<string, string> _dic1 = new Dictionary<string, string>();

            public A()
            {
                _dic1.Add("key1", "value1");
                _dic1.Add("key2", "value2");
            }
        }

        [Serializable]
        class B
        {
            private List<C> _cList = new List<C>();
            private A _a;

            public B(A a)
            {
                _a = a;
            }

            public List<C> CList
            {
                get { return _cList; }
                set { _cList = value; }
            }
        }

        [Serializable]
        class C
        {
            private Dictionary<string, string> _dic2 = new Dictionary<string, string>();
            private string _value;

            public C(string value)
            {
                _value = value;
            }
        }
    }

    If you run the code, you will find that the stream has a length of 4,532,517
    bytes. Now, try changing _dic1 (in class A) to be a Dictionary<string, object>
    and run the code again. Now, the stream length is 462,924 bytes. Why is there
    such a big difference just from changing the type? I also noticed that
    this might be due to the fact that I have another dictionary of the same type
    in class C.

    Am I doing something wrong here? If not, is this a bug?

    Thanks in advance!!

  • Peter Duniho

    #2
    Re: A bug in .Net Binary Serialization?

    On Tue, 01 Jul 2008 22:50:00 -0700, ztRon
    <ztRon@discussions.microsoft.com> wrote:
    > [...]
    > If you run the code, you will find that the stream has a length of
    > 4,532,517 bytes. Now, try changing _dic1(Class A) to be a
    > Dictionary<string, object> and run the code again. Now, the stream
    > length is 462,924 bytes. Why is there such a big difference just by
    > changing the type? What I noticed also was that this might be due to
    > the fact that I have another dictionary of the same type in Class C.
    >
    > Am I doing something wrong here? If not, is this a bug?

    I'll vote bug. But I admit, I'm no serialization expert so I might be
    missing something.

    But I do agree that it seems remarkable that such a simple change in the
    type parameter for a Dictionary<TKey, TValue> instance would produce such
    a dramatic difference. And I have in fact confirmed the behavior (albeit
    with slightly different numbers as the output...but the relative scale is
    the same).

    It would be interesting to try to serialize to a more readable format and
    see what the specific differences are. I don't have the time at the
    moment to explore too much, but it's something you might like to try.

    Pete

      • ztRon

        #4
        Re: A bug in .Net Binary Serialization?

        > It would be interesting to try to serialize to a more readable format and
        > see what the specific differences are. I don't have the time at the
        > moment to explore too much, but it's something you might like to try.

        This example was actually derived from more complex code, if that was what
        you meant. In my unit testing of it, I noticed that the size recently
        tripled due to the addition of one dictionary, even though there is only ever
        one instance of it. That was when I started to debug, and I was finally able to
        pinpoint the cause and come up with a simpler example to demonstrate the
        problem.

        • Peter Duniho

          #5
          Re: A bug in .Net Binary Serialization?

          On Wed, 02 Jul 2008 00:20:01 -0700, ztRon
          <ztRon@discussions.microsoft.com> wrote:
          >> It would be interesting to try to serialize to a more readable format
          >> and see what the specific differences are. I don't have the time at the
          >> moment to explore too much, but it's something you might like to try.
          >
          > This example was actually derived from a more complex code if that was
          > what you meant.

          No, it's not. The code you posted was fine. I'm talking about the
          resulting data itself. Serialize less, and to a format like SOAP so that
          you can take the two alternatives and inspect them as text files
          side-by-side. That should give you some clues as to what differences
          exist between the two. And that _might_ lead you to some useful
          conclusion as to why such a simple change produces such a dramatic
          difference.

          If you can accomplish that with the output from the BinaryFormatter, more
          power to you. :) But I'd go with a text-format serialization. I naïvely
          tried to swap in an XmlSerializer for the BinaryFormatter, but of course
          it has different requirements from the regular serialization stuff (for
          one, it requires everything to be public that's going to be serialized).
          I didn't have the time to make the necessary adjustments, but that could
          be something you might try, since the output from the XmlSerializer is yet
          again much more readable than SOAP.

          Pete

          • ztRon

            #6
            Re: A bug in .Net Binary Serialization?

            The problem with something like the XmlSerializer is that it does not support
            serialization of dictionaries.

            Does anyone else have any other ideas to this problem?

            Thanks.

            • Peter Duniho

              #7
              Re: A bug in .Net Binary Serialization?

              On Wed, 02 Jul 2008 18:17:10 -0700, ztRon
              <ztRon@discussions.microsoft.com> wrote:
              > The problem with something like the XmlSerializer is that it does not
              > support serialization of dictionaries.

              Not by default, no. You can customize it though. Of course, it's
              entirely possible that the problem is related to the default, automatic
              serialization of dictionaries, in which case customizing XmlSerializer to
              serialize your dictionaries won't help.
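
One common way to do that customization, sketched here under assumptions that are not from the thread: the dictionary is hidden from XmlSerializer with [XmlIgnore] and the same data is exposed through an array property. The names Entry, Holder and XmlDictionaryDemo are hypothetical, and the members are public because XmlSerializer only handles public members.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical key/value pair type used only for the XML representation.
public class Entry
{
    public string Key;
    public string Value;
}

// Hypothetical container, loosely standing in for class A from the original post.
public class Holder
{
    // XmlSerializer cannot handle IDictionary members, so skip the dictionary itself.
    [XmlIgnore]
    public Dictionary<string, string> Map = new Dictionary<string, string>();

    // Expose the same data as an array, which XmlSerializer can read and write.
    public Entry[] Entries
    {
        get { return Map.Select(p => new Entry { Key = p.Key, Value = p.Value }).ToArray(); }
        set { Map = (value ?? new Entry[0]).ToDictionary(e => e.Key, e => e.Value); }
    }
}

public static class XmlDictionaryDemo
{
    // Serializes a Holder to an XML string.
    public static string ToXml(Holder holder)
    {
        var serializer = new XmlSerializer(typeof(Holder));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, holder);
            return writer.ToString();
        }
    }

    // Rebuilds a Holder (and its dictionary) from an XML string.
    public static Holder FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Holder));
        using (var reader = new StringReader(xml))
        {
            return (Holder)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var holder = new Holder();
        holder.Map.Add("key1", "value1");
        holder.Map.Add("key2", "value2");

        string xml = ToXml(holder);
        Console.WriteLine(xml);

        Holder copy = FromXml(xml);
        Console.WriteLine(copy.Map.Count);
    }
}
```

Note that this changes how the dictionary is written, so it may well sidestep whatever the BinaryFormatter is doing rather than explain it.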

              > Does anyone else have any other ideas to this problem?

              Well, like I said, SOAP is also basically text-based and you can use that
              as easily as BinaryFormatter.

              Pete

              • ztRon

                #8
                Re: A bug in .Net Binary Serialization?

                But isn't it the same with SOAP? I think SOAP does not support Generics which
                thus means that it doesn't support dictionaries?

                • Peter Duniho

                  #9
                  Re: A bug in .Net Binary Serialization?

                  On Wed, 02 Jul 2008 21:06:00 -0700, ztRon
                  <ztRon@discussions.microsoft.com> wrote:
                  > But isn't it the same with SOAP? I think SOAP does not support Generics
                  > which thus means that it doesn't support dictionaries?

                  My recollection is that it does. I admit, I haven't tested it recently to
                  make sure. But I am under the impression that you can just swap in
                  SoapFormatter where you have BinaryFormatter, and it will "just work".

                  If I'm wrong, well...it should only take you a few minutes to find out. :)

                  Pete

                  • ztRon

                    #10
                    Re: A bug in .Net Binary Serialization?

                    I actually did that yesterday using the SoapFormatter and it did not work,
                    which was why I thought maybe you meant something else.

                    • Peter Duniho

                      #11
                      Re: A bug in .Net Binary Serialization?

                      On Wed, 02 Jul 2008 22:28:00 -0700, ztRon
                      <ztRon@discussions.microsoft.com> wrote:
                      > I actually did that yesterday using the SOAPFormatter and it did not
                      > work, which was why I thought maybe you meant something else.

                      Okay. Well, since I posted that message I've had a chance to try it
                      myself, and found the same thing you did. :)

                      I find it ironic, in a very unfortunate way, that the various
                      serialization techniques in .NET are basically incompatible with each
                      other. That is, there is no uniform serialization paradigm that allows
                      "pluggable" formatters.

                      Oh well. Sorry I wasn't of any help. Though, perhaps it was at least
                      some help to have someone validate your findings. :)

                      If you do make progress on the issue, please post your results here so
                      that others can benefit from the experience.

                      Thanks,
                      Pete

                      • SMJT

                        #12
                        Re: A bug in .Net Binary Serialization?

                        ztRon,

                        I don't think this is a bug, but just the way the data is stored when
                        you use binary serialization.

                        I had a similar problem when serializing classes to a file, where my
                        class contained an array of strings. If the string values were all the
                        same, then only one copy of the string was stored rather than multiple
                        copies of the same string (which I think is quite clever really; it saves
                        space and is probably quicker or something).

                        My bug was that when I changed one of the strings, the serialized class
                        size changed, so it shouldn't have been written back to the same slot in my
                        file, and I ended up corrupting my data file.

                        So I think you don't have a bug, just a feature of binary
                        serialization.

                        SMJT

                        • not_a_commie

                          #13
                          Re: A bug in .Net Binary Serialization?

                          I think you'll have better luck with the newer DataContractSerializer.
                          It works way better than the older serialization stuff. Here's a cut
                          from my code.

                          // Requires System, System.IO, System.Runtime.Serialization and System.Xml.
                          public byte[] GetDataBytes(params Type[] types)
                          {
                              // 'types' lists any additional known types in the object graph.
                              var ds = new DataContractSerializer(GetType(), types);
                              using (var mem = new MemoryStream())
                              {
                                  //using (var w = XmlDictionaryWriter.CreateTextWriter(mem)) // for xml
                                  using (var w = XmlDictionaryWriter.CreateBinaryWriter(mem))
                                  {
                                      ds.WriteObject(w, this);
                                  }
                                  return mem.ToArray();
                              }
                          }

                          And I prefer the DataContract and DataMember attributes more than the
                          Serializable one.
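
For anyone who wants the readable, text-format dump suggested earlier in the thread, here is a minimal sketch along the same lines. It assumes DataContractSerializer is available (.NET 3.0 or later) and relies on its documented support for [Serializable] types and generic dictionaries. Class A is copied from the original post; ReadableDump and ToXml are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Class A as in the original post: [Serializable], with a private dictionary field.
[Serializable]
class A
{
    private Dictionary<string, string> _dic1 = new Dictionary<string, string>();

    public A()
    {
        _dic1.Add("key1", "value1");
        _dic1.Add("key2", "value2");
    }
}

static class ReadableDump
{
    // Hypothetical helper: returns the object graph as indented XML text
    // so that two variants can be compared side by side.
    public static string ToXml(object graph)
    {
        var serializer = new DataContractSerializer(graph.GetType());
        var settings = new XmlWriterSettings { Indent = true };
        var text = new StringBuilder();
        using (var writer = XmlWriter.Create(text, settings))
        {
            serializer.WriteObject(writer, graph);
        }
        return text.ToString();
    }

    static void Main()
    {
        Console.WriteLine(ToXml(new A()));
    }
}
```

This shows what DataContractSerializer writes, not what BinaryFormatter writes, so it will not by itself explain the size difference reported at the top of the thread.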
