Making an iterator for a custom container class

  • arnaudk
    Contributor
    • Sep 2007
    • 425


    I'm trying to design a simple container class for some data of different types based on a vector of structs, but the vector and struct are protected so that the implementation of my container class can change independently of the interface.
    [CODE=cpp]
    #include <string>
    #include <vector>
    using namespace std;

    class MyContainer {
    public:
        // public stuff

    protected:
        // one record: a float, an int and a string
        struct Item {
            float dataFlt;
            int dataInt;
            string dataStr;
        } element;

        vector<Item> storage;
    };
    [/CODE]

    Now I want to provide an iterator for the items in MyContainer. One rather silly way would be to add to my class
    [CODE=cpp]
    class MyContainer {
        // ...
    public:
        vector<Item>::iterator itr;
        vector<Item>::iterator begin() { return storage.begin(); }
        vector<Item>::iterator end() { return storage.end(); }
    };

    // to use iterator:
    int main() {
        MyContainer stuff;
        // add some items

        for (stuff.itr = stuff.begin(); stuff.itr != stuff.end(); ++stuff.itr) {
            // process items
        }
        return 0;
    }
    [/CODE]

    I'm sure you're cringing at this point. I'd really like to do it in an STL-compliant way, something like declaring a MyContainer::iterator itr in main() and using that iterator. It seems I need to do something like
    [CODE=cpp]
    #include <iterator>

    class MyIterator :
    public MyContainer,
    public std::iterator<std::bidirectional_iterator_tag, type> {
    // overload the various operators
    };
    [/CODE]
    but I'm not sure how to use this, in particular since 'type' should be something like vector<Item> but of course the Item struct is not defined in the global scope. Any tips would be greatly appreciated!

    Thanks in advance,
    Arnaud
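For what it's worth, one way the question above could be approached is sketched here. It is only a sketch under a few assumptions: Item is made public so that callers can use what *itr yields, the helper add() and the free function sumInts() are made up for illustration, and the iterator is a nested class rather than a subclass of the container. The nested class supplies the five typedefs that std::iterator_traits looks for, so no std::iterator base class is needed.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

class MyContainer {
public:
    // Assumption: Item is public so callers can use the dereferenced value.
    struct Item {
        float dataFlt;
        int dataInt;
        std::string dataStr;
    };

    class iterator {
    public:
        // The five typedefs std::iterator_traits looks for.
        typedef std::bidirectional_iterator_tag iterator_category;
        typedef Item value_type;
        typedef std::ptrdiff_t difference_type;
        typedef Item* pointer;
        typedef Item& reference;

        iterator() : pos_() {}

        reference operator*() const { return *pos_; }
        pointer operator->() const { return &*pos_; }
        iterator& operator++() { ++pos_; return *this; }
        iterator operator++(int) { iterator old(*this); ++pos_; return old; }
        iterator& operator--() { --pos_; return *this; }
        iterator operator--(int) { iterator old(*this); --pos_; return old; }
        bool operator==(const iterator& rhs) const { return pos_ == rhs.pos_; }
        bool operator!=(const iterator& rhs) const { return pos_ != rhs.pos_; }

    private:
        friend class MyContainer;  // only the container may build iterators
        typedef std::vector<Item>::iterator Impl;
        explicit iterator(Impl pos) : pos_(pos) {}
        Impl pos_;  // implementation detail, invisible to callers
    };

    // Hypothetical helper that appends one record.
    void add(float f, int i, const std::string& s) {
        Item item = { f, i, s };
        storage_.push_back(item);
    }

    iterator begin() { return iterator(storage_.begin()); }
    iterator end() { return iterator(storage_.end()); }

private:
    std::vector<Item> storage_;  // could be swapped for another layout later
};

// Usage: sums dataInt over all records through the public iterator only.
int sumInts(MyContainer& c) {
    int total = 0;
    for (MyContainer::iterator itr = c.begin(); itr != c.end(); ++itr) {
        total += itr->dataInt;
    }
    return total;
}
```

Because callers only ever name MyContainer::iterator, the vector could later be replaced by some other storage as long as the nested class keeps the same operators.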
  • weaknessforcats
    Recognized Expert Expert
    • Mar 2007
    • 9214

    #2
    Why is the struct declared in the container??

    I would have thought you would have created objects of the struct and added them to the container.

    By having the struct protected, access to it is restricted to derived classes. However, these derived classes are not kinds of containers.

    Your class model appears incorrect.


    • arnaudk
      Contributor
      • Sep 2007
      • 425

      #3
      Originally posted by weaknessforcats
      Why is the struct declared in the container??
      There is no requirement a priori for the data to be stored in a struct. It may well be stored in one vector for each data type, or something else entirely; my solution should be implementation-independent. All we need to know is that we have a record of various known data types to be stored in some container and iterated over later. That's why I thought it best to make the struct and vector protected, accessible only to an (implementation-dependent) iterator subclass. Do you think that's going about it the wrong way? To my naive eye, it seems like that's sort of what happens in the STL vector library while things are kept very general with templates.

      Cheers,
      Arnaud


      • weaknessforcats
        Recognized Expert Expert
        • Mar 2007
        • 9214

        #4
        Originally posted by arnaudk
        Do you think that's going about it the wrong way?
        Yes I do. It appears you want a vector<VARIANT> where VARIANT is a Microsoft discriminated union used to pass data between C/C++ and Visual Basic. It is a C structured solution for an object-oriented problem.

        Your data model appears to need to be:

        Container

        IntContainer: public Container

        DoubleContainer : public Container

        etc..

        You use public functions in Container which call private virtual functions in Container. These private virtual functions are the ones overridden by the derived containers. This separates the Container interface (the public functions) from the Container implementation (the private virtual functions). This is a design pattern called the Template Method.

        Most operations can be performed using the Container interface. In cases where the derived container has special behaviors not available in the Container base class, you use a Visitor design pattern to execute those methods. There is an article on Visitor in the C/C++ articles forum.

        If you go this way you can add new containers of new types without impacting currently compiled code.

        See the book Design Patterns by Erich Gamma, et al., Addison-Wesley 1994.
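The split between public functions and private virtual functions described above might look roughly like the following. This is a minimal sketch, not code from the book or from any article: the member names size(), sum(), add(), doSize() and doSum(), and the free function average(), are all invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// Base class: public non-virtual interface, private virtual implementation.
class Container {
public:
    virtual ~Container() {}
    std::size_t size() const { return doSize(); }
    double sum() const { return doSum(); }
private:
    virtual std::size_t doSize() const = 0;
    virtual double doSum() const = 0;
};

class IntContainer : public Container {
public:
    void add(int v) { data_.push_back(v); }
private:
    // Private overrides: reachable only through the base class interface.
    std::size_t doSize() const { return data_.size(); }
    double doSum() const {
        double total = 0;
        for (std::size_t i = 0; i < data_.size(); ++i) total += data_[i];
        return total;
    }
    std::vector<int> data_;
};

class DoubleContainer : public Container {
public:
    void add(double v) { data_.push_back(v); }
private:
    std::size_t doSize() const { return data_.size(); }
    double doSum() const {
        double total = 0;
        for (std::size_t i = 0; i < data_.size(); ++i) total += data_[i];
        return total;
    }
    std::vector<double> data_;
};

// Client code is written once against Container and never names a
// derived class.
double average(const Container& c) {
    return c.size() == 0 ? 0.0 : c.sum() / c.size();
}
```

A new derived container added later would work with average() unchanged, which is the point being made about not impacting already compiled code.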


        • arnaudk
          Contributor
          • Sep 2007
          • 425

          #5
          Thanks a million for your comments. And congratulations for reaching 2000 posts!

          Call me stubborn, but I'd prefer to have one class which holds all the data rather than using a subclass for each type, I don't understand why that's a bad idea. To put the question differently, is it possible to write a class VectorWrapper which is just a wrapper for the vector<int> container so that I can write

          [CODE=cpp]
          VectorWrapper object;
          VectorWrapper::iterator itr;

          // add data to object

          for (itr = object.begin(); itr < object.end(); itr++ ) {
          // do something
          }
          [/CODE]

          In the meantime, thanks also for the book reference. A good book is hard to find so I'll get hold of a copy if you recommend it.

          Regards,
          Arnaud
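A wrapper of this kind is possible. The sketch below assumes the simplest approach, which is to re-export the vector's own iterator types through typedefs; push_back() and the free function total() are illustrative names, not anything defined earlier in the thread.

```cpp
#include <vector>

class VectorWrapper {
public:
    // Re-export the underlying iterator types under the wrapper's name.
    typedef std::vector<int>::iterator iterator;
    typedef std::vector<int>::const_iterator const_iterator;

    void push_back(int v) { data_.push_back(v); }

    iterator begin() { return data_.begin(); }
    iterator end() { return data_.end(); }
    const_iterator begin() const { return data_.begin(); }
    const_iterator end() const { return data_.end(); }

private:
    std::vector<int> data_;
};

// Usage: the loop from the post, written against VectorWrapper only.
int total(VectorWrapper& object) {
    int sum = 0;
    VectorWrapper::iterator itr;
    for (itr = object.begin(); itr != object.end(); ++itr) {
        sum += *itr;
    }
    return sum;
}
```

The loop compares with != rather than <, since < is only available for random-access iterators and would stop compiling if the wrapper were later changed to use, say, a list.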


          • RRick
            Recognized Expert Contributor
            • Feb 2007
            • 463

            #6
            You can create a vector with a structure and mimic the various functionality of a vector/map/whatever. If you do that, then realize what you're getting into.

            First, the STL is based on templates, which means compile-time type checking. Your structure is throwing that away.

            As for your structure:

            What about memory usage? Now every int costs sizeof(int) + sizeof(double) + sizeof(string) + whatever. This could be expensive, but unions can limit the total size a bit.

            How do you know which value is set in the structure? Now you need a type value to say who is who (oops, the structure just got bigger). When you access the structure, you have to go through a series of if - then - else statements for who is who. This is definitely not OO.
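To make the two costs above concrete, here is a small sketch. The struct TaggedItem, its Kind enum and the function describe() are invented for illustration; they only show the shape of the problem.

```cpp
#include <string>

struct TaggedItem {
    enum Kind { IsInt, IsDouble, IsString };
    Kind kind;        // the extra "who is who" field
    int i;            // every member is stored in every element,
    double d;         // even though only one of them is meaningful
    std::string s;
};

// Every access has to branch on the tag.
std::string describe(const TaggedItem& item) {
    if (item.kind == TaggedItem::IsInt) {
        return "int";
    } else if (item.kind == TaggedItem::IsDouble) {
        return "double";
    } else {
        return "string";
    }
}
```

Each new kind of value means another enumerator, another member and another branch in every function like describe().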

            In the OO world, we want to use the base/derived class mechanism and stay away from if - then - else situations. W4cats showed you one example where the container fits this model. Another possibility is to make the containee the base/derived class. The problem with that approach is that pointers must now be used in the container, so your container must know how to manage those resources.

            Sometimes, simple ideas can have large ramifications.


            • arnaudk
              Contributor
              • Sep 2007
              • 425

              #7
              Ah, thanks Rick, I think I see where I misunderstood W4cats.

              The thing is, I'm thinking of a data "packet" as consisting of an int, a double and a string. My only reservation with the inheritance model is that it forces me to split my packet over several seemingly disconnected objects:

              [CODE=cpp]
              // Declarations & definitions
              class Container { /* ... */ };
              class IntContainer : public Container { /* ... */ };
              class DoubleContainer : public Container { /* ... */ };

              int main() {
                  IntContainer intdata;
                  DoubleContainer doubledata;
                  Container::iterator itr;
              }
              [/CODE]

              Now the ith entries of intdata and doubledata are actually part of the same data packet i. In principle I can also have more intdata than doubledata, which makes no sense. Plus, I liked the idea that dereferencing the iterator (*itr) yields one data entry (namely, the struct in my previous example). Here, it's not clear what it should be, nor that I can have a single universal iterator for all the subclasses.

              It seems to me that the encapsulation is better reflected by a single class model. Of course, that class could be a handle class to data packet objects, but then I'm just effectively replacing the struct by a class of my own design and I bet I'll have to debug a million memory leaks.

              Surely this must be a very common problem: storing records consisting of different types within a single container and with a single universal iterator or key. It's the simplest database imaginable. Is my single class model still misguided?


              • weaknessforcats
                Recognized Expert Expert
                • Mar 2007
                • 9214

                #8
                Originally posted by arnaudk
                The thing is, I'm thinking of a data "packet" as consisting of an int, a double and a string. My only reservation with the inheritance model is that it forces me to split my packet over several seemingly disconnected objects:
                I don't see why this is true. You have a Packet base class and derived classes for the different kinds of packet data. You handle everything with functions using Packet* or Packet&.

                Your container now becomes a container of Packet* or Packet&. Or, even better, a container of handles to packets.

                Your packet is not split. Every packet is a derived object being used with a base pointer or reference. In short, fundamental polymorphism.

                By using an inheritance approach you can add packets of new types later on without redesigning your code.

                All you need is:

                Packet
                IntPacket : public Packet
                DoublePacket : public Packet

                vector<Handle<Packet> > theContainer; //read the Handle Class article!

                Ten years from now you code:

                DatePacket : public Packet

                and you can add Dates to the vector without needing to recompile any of the code using Packet. You just create DatePacket objects and add them to the vector as Handle<Packet>.
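As a rough illustration of the hierarchy above, the sketch below uses std::shared_ptr (C++11) as a stand-in for Handle<Packet>, since the Handle class from the article is not reproduced in this thread. The members text() and doText(), the typedef PacketHandle and the function joinAll() are all assumptions made for the example.

```cpp
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

class Packet {
public:
    virtual ~Packet() {}
    std::string text() const { return doText(); }  // public interface
private:
    virtual std::string doText() const = 0;        // overridden per type
};

class IntPacket : public Packet {
public:
    explicit IntPacket(int v) : value_(v) {}
private:
    std::string doText() const override {
        std::ostringstream os;
        os << value_;
        return os.str();
    }
    int value_;
};

class StringPacket : public Packet {
public:
    explicit StringPacket(const std::string& v) : value_(v) {}
private:
    std::string doText() const override { return value_; }
    std::string value_;
};

// Added later; nothing above this line has to change.
class DatePacket : public Packet {
public:
    DatePacket(int m, int d, int y) : m_(m), d_(d), y_(y) {}
private:
    std::string doText() const override {
        std::ostringstream os;
        os << m_ << '/' << d_ << '/' << y_;
        return os.str();
    }
    int m_, d_, y_;
};

// Assumption: shared_ptr plays the role of Handle<Packet> here.
typedef std::shared_ptr<Packet> PacketHandle;

// Written against Packet only, so it works for any derived packet.
std::string joinAll(const std::vector<PacketHandle>& container) {
    std::string out;
    for (std::size_t i = 0; i < container.size(); ++i) {
        if (i != 0) out += ", ";
        out += container[i]->text();
    }
    return out;
}
```

joinAll() was written without any knowledge of DatePacket, yet it handles DatePacket objects once they are added to the vector.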


                • RRick
                  Recognized Expert Contributor
                  • Feb 2007
                  • 463

                  #9
                  Your single class idea is more than just a container. It is a combination of container and structure. It is best described as MyItemContainer instead of MyContainer. I'm not sure why you want to combine the two, because in the STL world, containers and data are usually kept separate.

                  What if you need to expand your example to include items with more things? There is no easy way to do this. You can't put ItemDerived objects in vector< Item>. It's not going to work.

                  That's why I like the containee design described in post #8. This allows any and all, now or later, derived classes of Packet to be stored. The downside is that knowledge of all classes is limited by the base class.

                  If you're not familiar with the Handle concept, don't worry about it. Handle is a powerful C++ mechanism to deal with multiple resource access and resource deallocation; that's all (but that's a lot!).
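For readers who have not seen one, a handle can be pictured as a small reference-counted pointer. The sketch below is not the Handle class from the article mentioned in this thread; it is a minimal, single-threaded illustration, and the useCount() member and the Tracked helper type are invented for the example.

```cpp
#include <cstddef>

template <class T>
class Handle {
public:
    explicit Handle(T* p = 0) : ptr_(p), count_(new std::size_t(1)) {}

    // Copying shares the object and bumps the shared counter.
    Handle(const Handle& rhs) : ptr_(rhs.ptr_), count_(rhs.count_) {
        ++*count_;
    }

    Handle& operator=(const Handle& rhs) {
        ++*rhs.count_;  // increment first so self-assignment is safe
        release();
        ptr_ = rhs.ptr_;
        count_ = rhs.count_;
        return *this;
    }

    ~Handle() { release(); }

    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
    std::size_t useCount() const { return *count_; }

private:
    // The last handle to go away deletes the object and the counter.
    void release() {
        if (--*count_ == 0) {
            delete ptr_;
            delete count_;
        }
    }

    T* ptr_;
    std::size_t* count_;
};

// Helper type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;
```

The object is destroyed only when the last handle referring to it goes away, which is the "resource deallocation" part; the shared counter is the "multiple resource access" part.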


                  • arnaudk
                    Contributor
                    • Sep 2007
                    • 425

                    #10
                    Thank you once again for your replies, it is very much appreciated. What you both say seems to make sense, I think I probably haven't understood polymorphism properly so I'll study that again before asking more questions.


                    • weaknessforcats
                      Recognized Expert Expert
                      • Mar 2007
                      • 9214

                      #11
                      Originally posted by RRick
                      What if you need to expand your example to include items with more things? There is no easy way to do this. You can't put ItemDerived objects in vector< Item>. It's not going to work.
                      This is true. That's why I said to have a container of handles. The idea is the objects need not go in the container. All that needs to be in the container is the address of the derived object seen as a base class pointer. (I suggest a Handle to avoid premature destructor calls).

                      That way you add new derived types just by adding objects of the new types as base class pointers.

                      If you design correctly, you never need the derived class directly. All of the derived methods can be private. These private derived methods override corresponding private virtual functions in the base class. The base class public methods call the base class private methods to acquire the derived class behavior.


                      • arnaudk
                        Contributor
                        • Sep 2007
                        • 425

                        #12
                        OK, but ultimately, I still don't fully appreciate the difference between a container of handles to structs or a container of handles to Packet objects as was suggested above in your model using inheritance. Now, I can do things quick and dirty using my childish hybrid C single object model but I'd like to do them properly following the OO design philosophy so I'm trying to understand how to implement your suggestions.

                        I'm reading up on it but it would help me a lot if you could show me how I can create an instance of Packet with its payload of int, double and string because I can't see how that can all be done passing a single address representing the packet to the handle container.

                        So far, you've suggested

                        Packet
                        IntPacket : public Packet
                        DoublePacket : public Packet
                        StringPacket : public Packet

                        vector<Handle<Packet> > theContainer;

                        Now, in main(), I presume I am to start instances like so

                        IntPacket intdata;
                        DoublePacket doubledata;
                        StringPacket stringdata;

                        then add data to those objects. Then you say "Every packet is a derived object being used with a base pointer or reference." Could you show me this pointer/reference? I can't see any object which I can push_back(&object) into theContainer that represents the data packet as a whole the way my struct does, I only see three separate objects with three separate addresses: intdata, doubledata and stringdata. I can make three handle containers but that doesn't look like what you mean.

                        Now, suppose I've successfully represented each packet by a single address which I've stored in theContainer as you allude to. How do I subsequently access intdata, doubledata, etc. given an entry in theContainer?

                        Please say so if you think I've just misunderstood things completely. I'm sure things will clear up after I've done some more reading.


                        • weaknessforcats
                          Recognized Expert Expert
                          • Mar 2007
                          • 9214

                          #13
                          Let's expand your code:

                          [code=cpp]
                          vector<Handle<Packet> > database;

                          Handle<Packet> v1 = CreateIntPacket(25);
                          Handle<Packet> v2 = CreateDoublePacket(3.14159);
                          Handle<Packet> v3 = CreateStringPacket("Hello there");

                          //Add handles to the vector:
                          database.push_back(v1);
                          database.push_back(v2);
                          database.push_back(v3);
                          [/code]

                          Now you have a vector of Packet handles.

                          The create function (please read the Handle Class article in the C/C++ Articles forum as I repeat it here) looks like this:

                          [code=cpp]
                          // Each create function allocates the packet on the heap and
                          // returns the handle that now owns it.
                          Handle<Packet> CreateIntPacket(int value)
                          {
                              Packet* temp = new IntPacket(value);
                              Handle<Packet> handle(temp);
                              return handle;
                          }
                          Handle<Packet> CreateDoublePacket(double value)
                          {
                              Packet* temp = new DoublePacket(value);
                              Handle<Packet> handle(temp);
                              return handle;
                          }
                          Handle<Packet> CreateStringPacket(string value)
                          {
                              Packet* temp = new StringPacket(value);
                              Handle<Packet> handle(temp);
                              return handle;
                          }
                          [/code]

                          Notice there is no use of the stack. It is important that you never create stack variables and pass their addresses around. You need to control when the objects are deleted rather than have the compiler do it. An object cannot be deleted until there are no pointers to it left in the program. This is why you use a handle (a.k.a. a smart pointer).

                          From here on, all of your application functions should take Handle<Packet> arguments. You never use Handle<Packet>* or Handle<Packet>& arguments. Not ever. Not ever.

                          Next year you can create a DatePacket type, write a
                          CreateDatePacket(int m, int d, int y)
                          function and add DatePackets to your vector without needing to recompile any of the application code.

                          As you design the Packet, I suggest you design for the Visitor design pattern (see the Visitor article in the C/C++ Articles forum). That way you can call derived class methods that aren't in the base class.
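A Visitor-ready Packet could be sketched along these lines. This is not taken from the Visitor article; PacketVisitor, accept(), visit(), value(), str() and IntSummer are names chosen for the example, and only two packet types are shown.

```cpp
#include <string>

class IntPacket;
class StringPacket;

// One visit() overload per concrete packet type.
class PacketVisitor {
public:
    virtual ~PacketVisitor() {}
    virtual void visit(IntPacket& p) = 0;
    virtual void visit(StringPacket& p) = 0;
};

class Packet {
public:
    virtual ~Packet() {}
    virtual void accept(PacketVisitor& v) = 0;
};

class IntPacket : public Packet {
public:
    explicit IntPacket(int v) : value_(v) {}
    // Calls back with the concrete type, so the visitor sees an IntPacket.
    void accept(PacketVisitor& v) { v.visit(*this); }
    int value() const { return value_; }  // not part of the Packet interface
private:
    int value_;
};

class StringPacket : public Packet {
public:
    explicit StringPacket(const std::string& s) : str_(s) {}
    void accept(PacketVisitor& v) { v.visit(*this); }
    const std::string& str() const { return str_; }
private:
    std::string str_;
};

// A visitor that uses a derived-only method without any casts.
class IntSummer : public PacketVisitor {
public:
    IntSummer() : total_(0) {}
    void visit(IntPacket& p) { total_ += p.value(); }
    void visit(StringPacket&) {}  // strings do not contribute
    int total() const { return total_; }
private:
    int total_;
};
```

The trade-off is that each new packet type needs a new visit() overload in PacketVisitor, so the visitor interface has to be revisited whenever the hierarchy grows.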


                          • arnaudk
                            Contributor
                            • Sep 2007
                            • 425

                            #14
                            Perhaps I'm wrong but it seems we have a different understanding of 'data packet'. As I see it, iterating through the object you call database will yield an int followed by a double followed by a string rather than a single object encapsulating this data:
                            [CODE=cpp]vector<Handle<Packet> >::iterator itr;

                            for (itr = database.begin(); itr < database.end(); itr++) {
                            // first time: int 25
                            // second time: double 3.14159
                            // third time: string "Hello there"
                            // end of for loop.
                            }
                            [/CODE]
                            Is that what you intended? The vector of handles to the intdata, doubledata, etc. is what I call a packet. In fact, for my real database I'd need to construct a vector of handles to the vectors of handles of data (the latter of which you have called 'database' in your example). Is that correct?


                            • RRick
                              Recognized Expert Contributor
                              • Feb 2007
                              • 463

                              #15
                              There's nothing to stop you from making the packet as simple or complex as you need it. For example:
                              [code=cpp]
                              // One packet that carries the whole record: int, double and string.
                              class ComplexPacket : public Packet
                              {
                              public:
                                  ComplexPacket(int ival, double dval, const string& sval)
                                      : inum_(ival), dnum_(dval), str_(sval) {}
                              protected: // or private:
                                  int inum_;
                                  double dnum_;
                                  string str_;
                              };[/code]

