object serialization

  • William

    object serialization

    I'm looking for an example that shows how to serialize a C++
    object at its simplest, without using any other APIs. I have a class that
    I want to serialize and then pass to my Obj-C class so I can send it
    over the wire.

    I'm just looking for how to serialize it, then pack it back up on the
    other end.

    Any help much appreciated.

  • Ron AF Greve

    #2
    Re: object serialization

    Hi,

    This is how I do it:

    Derive every class from an ISerialize class.
    This ISerialize class has one member function accepting an IArchive, i.e.
    Serialize( IArchive& Archive ).
    The ISerialize class also has a virtual function that returns its name
    (overridden in the derived classes to return the class name). The name is
    used to deserialize the correct class.

    The IArchive class knows whether it is open for reading or writing.
    It has two virtual members, read( char *, long length ) and a similar
    write().
    It also contains non-virtual functions for all the regular types like
    string, long int (use templates etc. to reduce these to a few functions).
    In addition, the IArchive can serialize/deserialize ISerialize-derived
    classes by first storing their name and then calling the class's Serialize
    function, passing itself. On deserialization it first reads the class name
    (it knows this is an ISerialize class and not another type like long int
    etc. because you pass a pointer or reference to it).
    Then it creates the object (look up 'object factory'; this is a pretty
    standard C++ way of creating objects by name or id) and calls the Serialize
    member of the object, passing itself.
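
    The 'object factory' lookup mentioned above could be sketched roughly as
    follows. This is only an illustration, not the code from the engine
    described in this post: ObjectFactory, Register, Create, ClassName and the
    Point class are hypothetical names chosen for the example.

    ```cpp
    #include <functional>
    #include <map>
    #include <memory>
    #include <string>
    #include <utility>

    // Hypothetical stand-in for the ISerialize base class described above.
    struct ISerialize {
        virtual ~ISerialize() = default;
        virtual std::string ClassName() const = 0;  // name stored in the archive
    };

    // Maps a class name to a function that creates a fresh object of that class.
    class ObjectFactory {
    public:
        using Creator = std::function<std::unique_ptr<ISerialize>()>;

        void Register(const std::string& name, Creator creator) {
            creators_[name] = std::move(creator);
        }

        // Returns nullptr when the name was never registered.
        std::unique_ptr<ISerialize> Create(const std::string& name) const {
            auto it = creators_.find(name);
            if (it == creators_.end()) return nullptr;
            return it->second();
        }

    private:
        std::map<std::string, Creator> creators_;
    };

    // Example class that can be registered with the factory.
    struct Point : ISerialize {
        std::string ClassName() const override { return "Point"; }
    };
    ```

    On deserialization the archive would read the stored name, call Create with
    it, and then call Serialize on the object it gets back.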

    The Serialize member of an ISerialize-derived class typically looks like
    this (a piece of my code):

    void MCursor::MCursorInfo::Serialize( MArchive& Archive )
    {
        Archive.Serialize( Event );
        Archive.Serialize( OffsetX );
        Archive.Serialize( OffsetY );
        Archive.Serialize( Animation );
    }


    Note 1: since the archive knows whether it is reading or writing, it can
    choose between the read and the write function.
    Note 2: because you can override the read and write functions, you can
    easily create derived classes that store to memory, disks, sockets etc. You
    only have to override the very simple read and write functions.
    Note 3: because serialization and deserialization are done with one
    function, they are always in sync (no possibility of a mismatch between
    read and write).
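
    A minimal sketch of such an archive, under some loud assumptions: the
    names IArchive, MemoryArchive, IsWriting and CursorInfo are illustrative
    only, and plain values are copied as raw bytes in host byte order, which
    is fine for a savegame on one machine but not for sending data to a
    different kind of machine.

    ```cpp
    #include <cstddef>
    #include <cstring>
    #include <stdexcept>
    #include <string>
    #include <type_traits>
    #include <utility>
    #include <vector>

    // Base archive: knows its direction and forwards all I/O to two virtuals.
    class IArchive {
    public:
        explicit IArchive(bool writing) : writing_(writing) {}
        virtual ~IArchive() = default;
        bool IsWriting() const { return writing_; }

        // One call site for both directions (plain value types only).
        template <typename T>
        void Serialize(T& value) {
            static_assert(std::is_trivially_copyable<T>::value, "plain types only");
            if (writing_) Write(reinterpret_cast<const char*>(&value), sizeof value);
            else          Read(reinterpret_cast<char*>(&value), sizeof value);
        }

        // Strings are stored as a length followed by the characters.
        void Serialize(std::string& value) {
            std::size_t length = value.size();
            Serialize(length);
            if (!writing_) value.resize(length);
            if (length == 0) return;
            if (writing_) Write(value.data(), length);
            else          Read(&value[0], length);
        }

    protected:
        virtual void Read(char* data, std::size_t length) = 0;
        virtual void Write(const char* data, std::size_t length) = 0;

    private:
        bool writing_;
    };

    // Archive backed by a byte buffer; a file or socket version would only
    // need different Read/Write overrides.
    class MemoryArchive : public IArchive {
    public:
        MemoryArchive() : IArchive(true) {}                  // open for writing
        explicit MemoryArchive(std::vector<char> bytes)      // open for reading
            : IArchive(false), bytes_(std::move(bytes)) {}
        const std::vector<char>& Bytes() const { return bytes_; }

    protected:
        void Read(char* data, std::size_t length) override {
            if (position_ + length > bytes_.size())
                throw std::runtime_error("archive truncated");
            std::memcpy(data, bytes_.data() + position_, length);
            position_ += length;
        }
        void Write(const char* data, std::size_t length) override {
            bytes_.insert(bytes_.end(), data, data + length);
        }

    private:
        std::vector<char> bytes_;
        std::size_t position_ = 0;
    };

    // The same member function saves and loads, so the two cannot drift apart.
    struct CursorInfo {
        int OffsetX = 0;
        int OffsetY = 0;
        std::string Animation;
        void Serialize(IArchive& archive) {
            archive.Serialize(OffsetX);
            archive.Serialize(OffsetY);
            archive.Serialize(Animation);
        }
    };
    ```

    Saving means calling Serialize with a default-constructed MemoryArchive;
    loading means constructing a MemoryArchive from the saved Bytes() and
    calling the very same Serialize on an empty object.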

    When this works, extend your archive to keep track of written pointers to
    ISerialize objects (so that on deserialization it creates only one object
    when there are several pointers to the same object),
    and add some code to automatically save STL vectors/sets/maps etc. of
    ISerialize objects.

    It might take some time to set up, but once you have it, it really works
    like a charm :-) and it is loads of fun to see how easily it works then. I
    use it to save my 2D game engine's internal state (savegame) and load the
    same state back into memory (deserialization).

    Things to study:
    Object factories
    Look up Microsoft's way of serialization on MSDN (I stole the duplicate
    pointer idea from them :-) )
    Make sure you are reasonably acquainted with templates


    Regards, Ron AF Greve

    • Kirit Sælensminde

      #3
      Re: object serialization

      On Apr 20, 8:24 am, "Ron AF Greve" <ron@localhost> wrote:
      > void MCursor::MCursorInfo::Serialize( MArchive& Archive )
      > {
      >     Archive.Serialize( Event );
      >     Archive.Serialize( OffsetX );
      >     Archive.Serialize( OffsetY );
      >     Archive.Serialize( Animation );
      > }

      There is one extra complication that this approach doesn't address. As
      the file format changes you need to be able to read old versions with
      newer software.

      There are a few ways of doing this. The way I've used in the past is
      that each object also writes a schema number which can then be used to
      work out which structure to use. You may get something that looks a
      little more like this:

      void MCursor::MCursorInfo::Serialize( MArchive& Archive )
      {
          int version = Archive.Version< MCursorInfo >( 2 );
          if ( version >= 1 ) {
              Archive.Serialize( Event );
              Archive.Serialize( OffsetX );
              Archive.Serialize( OffsetY );
          }
          if ( version >= 2 )
              Archive.Serialize( Animation );
      }

      You want to arrange for Version to return the version that is to be
      used. By passing in the type (I've done it via a template
      specialisation, but there are other ways too) it can store a lookup
      for overall file version against each schema part so that you can also
      save to older file formats.
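
      One way Version could behave is sketched below. It is a simplification
      under stated assumptions: the Archive here is a hypothetical int-only
      class, the type parameter is accepted but unused, and the lookup of
      schema numbers against an overall file version is left out.

      ```cpp
      #include <cstddef>
      #include <stdexcept>
      #include <utility>
      #include <vector>

      // Hypothetical archive that stores ints only, enough to show versioning.
      class Archive {
      public:
          Archive() : writing_(true) {}
          explicit Archive(std::vector<int> data)
              : writing_(false), data_(std::move(data)) {}

          void Serialize(int& value) {
              if (writing_) { data_.push_back(value); return; }
              if (position_ >= data_.size())
                  throw std::runtime_error("archive truncated");
              value = data_[position_++];
          }

          // Writing: records `current` and returns it.
          // Reading: returns the schema number stored with the object.
          // T is unused here; a fuller version could use it to pick the
          // schema number that belongs to an older overall file format.
          template <typename T>
          int Version(int current) {
              int version = current;
              Serialize(version);
              return version;
          }

          const std::vector<int>& Data() const { return data_; }

      private:
          bool writing_;
          std::vector<int> data_;
          std::size_t position_ = 0;
      };

      struct CursorInfo {
          int OffsetX = 0;
          int OffsetY = 0;
          int Animation = 0;  // field added in schema version 2

          void Serialize(Archive& archive) {
              int version = archive.Version<CursorInfo>(2);
              if (version >= 1) {
                  archive.Serialize(OffsetX);
                  archive.Serialize(OffsetY);
              }
              if (version >= 2)
                  archive.Serialize(Animation);
          }
      };
      ```

      Reading data written with schema 1 leaves Animation at its default,
      while anything saved by this code is written with schema 2.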


      K


      • James Kanze

        #4
        Re: object serialization

        On Apr 20, 2:49 am, William <Wizumw...@gmail.com> wrote:
        > I'm looking for an example that shows how to serialize a C++
        > object at its simplest, without using any other APIs. I have a class that
        > I want to serialize and then pass to my Obj-C class so I can send it
        > over the wire.
        > I'm just looking for how to serialize it, then pack it back up on the
        > other end.

        There are, regrettably, no simple answers. Basically, you'll
        have to either define a line protocol yourself, or use an
        existing one, then code every type you use to conform to the
        line protocol.

        If there are no other particular constraints, I'd start with XDR
        for the low level types, and build on it. I'd probably define an
        oxdrstream and an ixdrstream, with << and >> operators for the
        primitive types. (Handling integers is easy. Floating point
        less so; depending on how portable you want your code to be, it can
        even be very complex.) More complex types then output each field.
        (Some consideration must also be given to variable length types,
        e.g. vectors and strings. XDR has some basic rules for these as
        well.) And don't forget to follow pointers, if the pointed-to
        data is logically part of your object. (You cannot, of course,
        serialize a pointer.)
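
        A rough sketch of what such streams might look like for two of the
        XDR encodings: unsigned 32-bit integers (4 bytes, most significant
        byte first) and strings (a 4-byte length, the bytes, then zero
        padding to a multiple of 4). Apart from the oxdrstream and ixdrstream
        names, everything here is an assumption made for illustration; signed
        and floating point types are deliberately left out.

        ```cpp
        #include <cstddef>
        #include <cstdint>
        #include <stdexcept>
        #include <string>
        #include <utility>
        #include <vector>

        // Output stream that lays values out the way XDR does.
        class oxdrstream {
        public:
            // Integers: 4 bytes, most significant byte first.
            oxdrstream& operator<<(std::uint32_t value) {
                for (int shift = 24; shift >= 0; shift -= 8)
                    bytes_.push_back(
                        static_cast<unsigned char>((value >> shift) & 0xFF));
                return *this;
            }
            // Strings: length, bytes, zero padding to a multiple of 4.
            oxdrstream& operator<<(const std::string& value) {
                *this << static_cast<std::uint32_t>(value.size());
                bytes_.insert(bytes_.end(), value.begin(), value.end());
                while (bytes_.size() % 4 != 0) bytes_.push_back(0);
                return *this;
            }
            const std::vector<unsigned char>& bytes() const { return bytes_; }

        private:
            std::vector<unsigned char> bytes_;
        };

        // Input stream that decodes what oxdrstream produced.
        class ixdrstream {
        public:
            explicit ixdrstream(std::vector<unsigned char> bytes)
                : bytes_(std::move(bytes)) {}

            ixdrstream& operator>>(std::uint32_t& value) {
                require(4);
                value = 0;
                for (int i = 0; i < 4; ++i)
                    value = (value << 8) | bytes_[position_++];
                return *this;
            }
            ixdrstream& operator>>(std::string& value) {
                std::uint32_t length = 0;
                *this >> length;
                std::size_t padded =
                    (static_cast<std::size_t>(length) + 3) / 4 * 4;
                require(padded);
                value.assign(bytes_.begin() + position_,
                             bytes_.begin() + position_ + length);
                position_ += padded;  // skip the padding as well
                return *this;
            }

        private:
            void require(std::size_t count) const {
                if (bytes_.size() - position_ < count)
                    throw std::runtime_error("xdr stream truncated");
            }
            std::vector<unsigned char> bytes_;
            std::size_t position_ = 0;
        };
        ```

        A user-defined type would then provide its own << and >> that simply
        stream each field in a fixed order.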

        You'll also have to give some thought to how the receiving end
        will know what type it is getting. Depending on the protocol,
        this may be more or less implicit, but most of the times, there
        will be cases where you'll have to transmit this information as
        well.

        --
        James Kanze (GABI Software) email: james.kanze@gmail.com
        Conseils en informatique orientée objet/
        Beratung in objektorientierter Datenverarbeitung
        9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


        • Roland Pibinger

          #5
          Re: object serialization

          On 20 Apr 2007 00:52:40 -0700, James Kanze wrote:
          >If there are no other particular constraints, I'd start with XDR
          >for the low level types, and build on it. I'd probably define an
          >oxdrstream and an ixdrstream, with << and >> operators for the
          >primitive types.
          Jack W. Reeves developed that kind of XDR library in a series of
          articles (which may be found on the internet).


          --
          Roland Pibinger
          "The best software is simple, elegant, and full of drama" - Grady Booch


          • Roland Pibinger

            #6
            Re: object serialization

            On 19 Apr 2007 17:49:46 -0700, William <Wizumwalt@gmail.com> wrote:
            >I'm looking for an example that shows how to serialize a C++
            >object at its simplest, without using any other APIs. I have a class that
            >I want to serialize and then pass to my Obj-C class so I can send it
            >over the wire.
            Read the C++ FAQ-Lite.
            >I'm just looking for how to serialize it, then pack it back up on the
            >other end.
            Why not just write the data in some data format? Serialization seems
            to be the wrong level of abstraction, not only in C++ but also in
            other languages. Convert your data to a readable, maybe even
            standardized, format and the receivers will be happy.


            --
            Roland Pibinger
            "The best software is simple, elegant, and full of drama" - Grady Booch


            • Ron AF Greve

              #7
              Re: object serialization

              Hi Kirit,

              Thanks for the reply and noting I missed something important.

              Actually I have a number assigned to the complete archive. However, I don't
              use it anywhere, since I couldn't really decide whether it was better to have
              a version per object or to just change the archive version when one or more
              objects change and then use if( Archive.GetVersion() ) or something like
              that.

              However looking at your example I realize that it would indeed be better to
              do it on a 'per object' basis.

              Regards, Ron AF Greve

              • William

                #8
                Re: object serialization

                > Why not just write the data in some data format? Serialization seems
                > to be the wrong level of abstraction, not only in C++ but also in
                > other languages. Convert your data to a readable maybe even
                > standardized format and the receivers will be happy.

                Well, I'm only doing it so I can pass the data from a process to a
                thread I just launched. So it's all sorta in the same name space. It's
                just that my libraries are in C++ and I was hoping to serialize the data
                natively in C++ so that when I copy it into my Obj-C objects, I'd be
                most of the way done, with performance benefits.

                I'll see if I can make this happen.


                • James Kanze

                  #9
                  Re: object serialization

                  On Apr 21, 2:40 am, William <Wizumw...@gmail.com> wrote:
                  > > Why not just write the data in some data format? Serialization seems
                  > > to be the wrong level of abstraction, not only in C++ but also in
                  > > other languages. Convert your data to a readable maybe even
                  > > standardized format and the receivers will be happy.
                  > Well, I'm only doing it so I can pass the data from a process to a
                  > thread I just launched.

                  If the two threads are in the same process, you don't need
                  serialization, since they share common memory. (You will need
                  to ensure synchronized access, however.)
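
                  Synchronized access to shared memory could look roughly like
                  the sketch below, which guards one object with a mutex and
                  hands out copies. Document, SharedDocument, Update and
                  Snapshot are hypothetical names used only for illustration.

                  ```cpp
                  #include <mutex>
                  #include <string>
                  #include <vector>

                  // Hypothetical data that two threads want to share.
                  struct Document {
                      std::string title;
                      std::vector<int> values;
                  };

                  // All access to the shared object goes through one mutex.
                  class SharedDocument {
                  public:
                      void Update(const Document& doc) {
                          std::lock_guard<std::mutex> lock(mutex_);
                          doc_ = doc;
                      }

                      // Returns a private copy, so the caller can keep working
                      // on it without holding the lock.
                      Document Snapshot() const {
                          std::lock_guard<std::mutex> lock(mutex_);
                          return doc_;
                      }

                  private:
                      mutable std::mutex mutex_;
                      Document doc_;
                  };
                  ```

                  A worker that takes a Snapshot is unaffected by later calls
                  to Update, without any byte-level serialization involved.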

                  > So it's all sorta in the same name space. It's
                  > just my libraries are in C++ and I was hoping to serialize the data
                  > natively in C++ so that when I copy it into my Obj-C objects, I'd be
                  > most of the way done w/ performance benefits.

                  In fact, you're concerned with communicating between two
                  languages, rather than between two processes or machines.
                  That's a different kettle of fish entirely; depending on the
                  representations used, it can vary from trivial (from C++ to C)
                  to very complicated (C++ to Cobol, perhaps). Serialization is
                  one solution, of course, but it is rarely the simplest or the
                  most efficient. Basically, when going from language A to
                  language B:

                  -- If possible, use data with compatible formats. This is what
                  makes C++ to C work so well; C++ more or less requires a
                  large category of data types to have a format compatible
                  with C, so all you have to do is pass a pointer to it to the
                  C function.

                  -- Failing that, you'll have to convert the data somehow. The
                  conversions can be more or less complicated: when going from
                  C++ to Fortran, for example, most of the basic types (int,
                  float, etc.) will be compatible, so all you have to worry
                  about is the fact that all Fortran parameters are passed by
                  reference, that Fortran arrays are column major rather
                  than row major (but that can possibly be handled just by
                  declaring them differently), and strings: Fortran's
                  character type is not generally compatible with either
                  std::string or char[]. If the target is Cobol, on the other
                  hand, you may end up having to convert doubles to BCD, and
                  what have you.

                  -- Finally, of course, if you already have serialization
                  routines available in both languages, you can use them. Be
                  aware, however, that what this really means is converting
                  C++ data to a neutral, third format, and then converting
                  this format back to the format in the other language. And
                  that the neutral, third format conforms to a number of
                  constraints which generally make it less efficient to
                  convert to and from than other formats.
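
                  The first option in the list above, compatible formats, can
                  be sketched as follows. Since Objective-C is a superset of C,
                  a struct like this should be usable from that side as well.
                  CursorData and cursor_distance_squared are hypothetical names,
                  and a header shared with C code would use <stdint.h> and
                  int32_t rather than the std:: spellings.

                  ```cpp
                  #include <cstdint>
                  #include <type_traits>

                  extern "C" {

                  // Only C-compatible members, so C code sees the same layout.
                  struct CursorData {
                      std::int32_t offset_x;
                      std::int32_t offset_y;
                      double       speed;
                  };

                  // Hypothetical C-callable consumer; in practice it would be
                  // implemented on the other side of the language boundary.
                  double cursor_distance_squared(const struct CursorData* data) {
                      return static_cast<double>(data->offset_x) * data->offset_x +
                             static_cast<double>(data->offset_y) * data->offset_y;
                  }

                  }  // extern "C"

                  // Compile-time check that nothing C++-only crept in.
                  static_assert(std::is_standard_layout<CursorData>::value,
                                "layout must match what C expects");
                  ```

                  Nothing is converted here: the caller fills in the struct and
                  passes its address across the boundary.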

                  If the serialization is already present and handy in both
                  languages, or if it is or will be necessary anyway,
                  serialization is certainly an option to be considered.
                  Otherwise, however, it is by far the least efficient, both where
                  it always counts (your time and effort), and in terms of
                  performance.

                  --
                  James Kanze (Gabi Software) email: james.kanze@gmail.com
                  Conseils en informatique orientée objet/
                  Beratung in objektorientierter Datenverarbeitung
                  9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


                  • Kirit Sælensminde

                    #10
                    Re: object serialization

                    On Apr 21, 6:03 am, "Ron AF Greve" <ron@localhost> wrote:
                    > Hi Kirit,
                    >
                    > Thanks for the reply and noting I missed something important.
                    >
                    > Actually I have a number assigned to the complete archive. However I don't
                    > use it anywhere since I couldn't really decide if it was better to have a
                    > version per object or just change the archive version when one or more
                    > objects change and then use if( Archive.GetVersion() ) or something like
                    > that.
                    >
                    > However looking at your example I realize that it would indeed be better to
                    > do it on a 'per object' basis.

                    Top posting is generally frowned upon here.

                    There is a trade-off with the version per object rather than version
                    per file. Clearly it is going to make the archive data larger, but
                    even in '95 when I wrote about serialisation in DDJ
                    <http://www.ddj.com/184409645?pgno=3> the overhead was worth it. These
                    object numbers are especially important when you're in development because
                    the object layouts are going to change rather frequently.

                    There is a way around this. You can arrange for the archive to be
                    versioned, and for final release builds you also arrange for
                    Archive.Version<>() to return the archive version number, which you
                    make sure is a higher number than the version number you'd got up to
                    for any object. For the next development round you then start numbering
                    each object's versions above this number again.

                    Archives saved with development builds will therefore be bigger than
                    those saved with release builds, but they will be readable by the
                    software version that saved them or any later version (there are some
                    complications about object classes that are no longer in use, but the
                    details aren't too hard to work out).

                    As for the DDJ article, the code isn't really worth anything now. It
                    was written when template programming meant using C macros - ouch! The
                    ideas should still be relevant though.


                    K


                    • Ron AF Greve

                      #11
                      Re: object serialization

                      Hi Kirit,

                      "Kirit Sælensminde" <kirit.saelensminde@gmail.com> wrote in message
                      news:1177211418.091791.322890@y5g2000hsa.googlegroups.com...
                      > There is a trade-off with the version per object rather than version
                      > per file. Clearly it is going to make the archive data larger, but
                      > even in '95 when I wrote about serialisation in DDJ
                      > <http://www.ddj.com/184409645?pgno=3> the overhead was worth it. These
                      > object numbers are especially important when you're in development because
                      > the object layouts are going to change rather frequently.

                      Thanks for the article; it is good to see different approaches. Though in my
                      case I like the superclass idea (I have a common superclass anyway, since I
                      need it in the rest of the engine). As a note to the OP (in the superclass
                      approach anyway): pointers have to be tested for zero. For a zero pointer,
                      a zero is saved or loaded, but (obviously) the virtual Serialize member
                      must not be called on it.
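
                      The zero-pointer check could be sketched like this. It is
                      a simplified illustration, not the engine code: Archive is
                      a hypothetical int-only class, Animation is not
                      polymorphic, and SerializePointer is an invented helper.
                      With a real superclass the object would be created by
                      name through the factory instead of make_unique.

                      ```cpp
                      #include <cstddef>
                      #include <memory>
                      #include <stdexcept>
                      #include <utility>
                      #include <vector>

                      // Hypothetical int-only archive, just enough for the example.
                      class Archive {
                      public:
                          Archive() : writing_(true) {}
                          explicit Archive(std::vector<int> data)
                              : writing_(false), data_(std::move(data)) {}
                          bool IsWriting() const { return writing_; }

                          void Serialize(int& value) {
                              if (writing_) { data_.push_back(value); return; }
                              if (position_ >= data_.size())
                                  throw std::runtime_error("archive truncated");
                              value = data_[position_++];
                          }
                          const std::vector<int>& Data() const { return data_; }

                      private:
                          bool writing_;
                          std::vector<int> data_;
                          std::size_t position_ = 0;
                      };

                      struct Animation {
                          int Frame = 0;
                          void Serialize(Archive& archive) { archive.Serialize(Frame); }
                      };

                      // Stores a 0/1 marker first; the object's Serialize runs
                      // only when the marker says an object is present.
                      void SerializePointer(Archive& archive,
                                            std::unique_ptr<Animation>& pointer) {
                          int present = pointer ? 1 : 0;
                          archive.Serialize(present);  // reading overwrites `present`
                          if (!present) { pointer.reset(); return; }
                          if (!archive.IsWriting())
                              pointer = std::make_unique<Animation>();
                          pointer->Serialize(archive);
                      }
                      ```

                      Saving a null pointer therefore costs one marker value, and
                      loading it yields a null pointer again.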

                      > There is a way around this. You can arrange for the archive to be
                      > versioned and for final release builds you also arrange for
                      > Archive.Version<>() to return the archive version number which you
                      > make sure is a higher number than the version number you'd got up to
                      > for any object. For the development round you then start jump above
                      > this number for each object again.
                      >
                      > Archives saved with development builds will therefore be bigger than
                      > those saved with release builds, but they will be readable by the
                      > software version that saved them or any later version (there are some
                      > complications about object classes that are no longer in use, but the
                      > details aren't too hard to work out).

                      OK, good idea, that would be the best of both worlds.

                      > As for the DDJ article, the code isn't really worth anything now. It
                      > was written when template programming meant using C macros - ouch! The
                      > ideas should still be relevant though.
                      >
                      > K

                      Regards, Ron AF Greve





                      • William

                        #12
                        Re: object serialization

                        > If the two threads are in the same process, you don't need
                        > serialization, since they share common memory. (You will need
                        > to ensure synchronized access, however.)

                        That's one possibility, but since the user may modify these objects
                        while my thread is processing the copies, it really makes sense to do
                        it this way.

                        > In fact, you're concerned with communicating between two
                        > languages, rather than between two processes or machines.
                        > That's a different kettle of fish entirely; depending on the
                        > representations used, it can vary from trivial (from C++ to C)
                        > to very complicated (C++ to Cobol, perhaps). Serialization is
                        > one solution, of course, but it is rarely the simplest or the
                        > most efficient. Basically, when going from language A to
                        > language B:
                        >
                        > -- If possible, use data with compatible formats. This is what
                        > makes C++ to C work so well; C++ more or less requires a
                        > large category of data types to have a format compatible
                        > with C, so all you have to do is pass a pointer to it to the
                        > C function.

                        I'm going from C++ to Obj-C, and the pointer would work except for the
                        reasons I state above.

                        > -- Failing that, you'll have to convert the data somehow. The
                        > conversions can be more or less complicated: when going from
                        > C++ to Fortran, for example, most of the basic types (int,
                        > float, etc.) will be compatible, so all you have to worry
                        > about is the fact that all Fortran parameters are passed by
                        > reference, that Fortran arrays are column major rather
                        > than row major (but that can possibly be handled just by
                        > declaring them differently), and strings: Fortran's
                        > character type is not generally compatible with either
                        > std::string or char[]. If the target is Cobol, on the other
                        > hand, you may end up having to convert doubles to BCD, and
                        > what have you.

                        Conversions should be easy from C++ to Obj-C.

                        > -- Finally, of course, if you already have serialization
                        > routines available in both languages, you can use them. Be
                        > aware, however, that what this really means is converting
                        > C++ data to a neutral, third format, and then converting
                        > this format back to the format in the other language. And
                        > that the neutral, third format conforms to a number of
                        > constraints which generally make it less efficient to
                        > convert to and from than other formats.
                        >
                        > If the serialization is already present and handy in both
                        > languages, or if it is or will be necessary anyway,
                        > serialization is certainly an option to be considered.
                        > Otherwise, however, it is by far the least efficient, both where
                        > it always counts (your time and effort), and in terms of
                        > performance.

                        Understood.
