Advanced topics... optimising / C++ / OOP / RefCounting / type-safety

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • st_ev_fe@hotmail.com

    Advanced topics... optimising / C++ / OOP / RefCounting / type-safety

    Hi people,

    I've been doing C for about 7 years now. But I'm new to C++.

    I've decided that C++'s operator overloading could be very handy. I'm
    writing something much like auto_ptr, except for my own set of classes.
    Basically, it's a class that is supposed to be allocated automatically
    on the stack, and possesess one member which is a pointer to an object.

    Unlike auto_ptr, my object will actually call a lock or unlock function
    upon the object. Each object has a "long RefCount" member, and a
    "MyClass*" which is a pointer to some type information. The "class"
    pointed to is just a statically allocated struct that contains some
    nice info about each instance of itself, such as the size, the
    constructor and destructor functions.

    Basically I've written a ref-counting memory management system in C++.

    I think anyone familiar with the STL string class shouldn't be too
    worried about this approach, because most "string" implementations
    actually use refcounts internally, anyhow.

    And this is my first venture into properly using C++. So by default,
    I'll be getting things wrong :o)

    Well the code is correct. It compiles, and runs as I expect it to (took
    some debugging but it's there now).

    So what's my question?

    Basically, is there a faster / smaller / neater way to do what I'm
    doing? I've tested my code when setting the compiler to generate small
    size code (for Mac PPC), and... well the refcounting system seems to
    use A LOT of code just to do something very simple like:

    Ref1 = Ref2;

    Just that takes up about 52 bytes of in my compiled PPC executable!
    Normally assigning a pointer takes 4 bytes, (that's 1 PPC instruction).

    Well, obviously that's quite bad right?

    The code executing is htis:

    class OPObject {

    OPStruct* Value;

    OPObject& operator=(OPObj ect& obj) {
    OPUnlock(Value) ;
    Value = ref.Value;
    OPLock(Value);
    return *this;
    }


    This is actually an inline function. This is my first question, is it
    even worth inlining this function?? Especially when I'm using two
    inlined functions, OPLock and OPUnlock?

    OPLock goes like this:

    inline void OPLock( OPStruct* obj ) {
    if (obj) {
    obj->RefCount++;
    }
    }


    inline void OPUnlock( OPStruct* obj ) {
    if (obj) {
    long rc = obj->RefCount - 1;
    obj->RefCount = rc;
    if (!rc) {
    OPDeAlloc( obj );
    }
    }
    }

    Doesn't seem like much code, but it adds up...

    It's a lot just to do "a = b;", really.

    Of course, I don't necessarily need to use my OPObjects everywhere...
    OPObject is just for memory management really. I can just extract the
    OPStruct* and pass that around, knowing that it's a bare pointer that
    could be invalid if stored outside of the lifetime of the OPObject.

    Would I get faster code if I uninlined some stuff??

    What about altering my code? Does anyone know a faster way to do
    refcounting? Any OOP experts out there that know a faster way to do
    this?? Any OOP experts out there that know how other refcounting
    languages do their refcounting? Maybe if someone knows how Java or perl
    or VisualBasic or other languages do it....??

    The "if (obj)" thing irks me because I just wonder if there is a way to
    do without it... but it seems like I must have it, because it's quite
    possible to have an OPStruct* to 0, so I need to be able to handle
    that. Unless someone can think of a clever way to avoid having to deal
    with that? (One that doesn't involve me recoding my entire library
    which currently relies on comparing object pointers to 0 to see if they
    exist or not.)

    Also, this brings me back to the question of neatness and simplicity.

    I just don't know really how to make this into a compile-time type
    checking safe system.

    Right now I have a base class, OPStruct. All my other classes subclass
    from this, so I have:

    struct MySpecialClass : public OPStruct { /*...*/ };
    struct MyOtherClass : public OPStruct { /*...*/ };

    etc etc

    Now, OPObject (with it's "OPStruct* Value"), doesn't actually specify
    any type other than OPStruct.

    OPObject has a handy converter to and from OPStruct*, so there is no
    problem passing and returning the two types between each other.

    So that means I can stuff a "MySpecialClass " into OPObject...

    And then call this function MySpecialFuncti on(OPStruct* obj); like so

    MySpecialFuncti on( MyOPObject ); // OPObject's operator convert to
    OPStruct* is called.

    because I've not specified any types in MySpecialFuncti on's params. if
    I DID specify a type in MySpecialFuncti on's param, like so:

    MySpecialFuncti on( MySpecialClass* obj );

    then I can't call this:

    MySpecialFuncti on( MyOPObject );

    then I'll get a type error, because OPStruct* isn't necessarily a
    MySpecialClass* !!

    So what do I do?

    I want compile-time type checking with a small simple way to manage
    this. Can I do it with some special define trickery that lets me define
    OPStruct* subclasses with their corresponding OPObject subclassing
    owners?

    Or is there a better way using templates? I don't want any overhead,
    this must strictly be a compile-time type check remember.

    Sorry for taking so long to explain myself but I think if I used less
    works I wouldn't have stated the situation fully :o)

    Thanks in advance to anyone smart enough and helpful enough to reply!

  • niklasb@microsoft.com

    #2
    Re: Advanced topics... optimising / C++ / OOP / RefCounting / type-safety

    st_ev_fe@hotmai l.com wrote:[color=blue]
    > Hi people,
    >
    > I've been doing C for about 7 years now. But I'm new to C++.
    >
    > I've decided that C++'s operator overloading could be very handy. I'm
    > writing something much like auto_ptr, except for my own set of classes.
    > Basically, it's a class that is supposed to be allocated automatically
    > on the stack, and possesess one member which is a pointer to an object.[/color]

    Implementing a smart pointer type is a fairly advanced project
    for someone new to C++. It's true that std::auto_ptr is pretty
    limited, but you might be better of using some other existing
    implementation such as boost::shared_p tr rather than rolling
    your own.

    That said, I imagine you're really doing this as an exercise
    and are having too much fun to stop. If so, it's still worth
    looking at some existing implemetnations such as boost::shared_p tr
    and boost::intrusiv e_ptr. The latter is pretty close to what
    you're trying to do -- i.e., a reference-counting smart pointer
    in which the ref count is stored as a member of the pointee
    (hence "intrusive" ).
    [color=blue]
    > Unlike auto_ptr, my object will actually call a lock or unlock function
    > upon the object. Each object has a "long RefCount" member, and a
    > "MyClass*" which is a pointer to some type information. The "class"
    > pointed to is just a statically allocated struct that contains some
    > nice info about each instance of itself, such as the size, the
    > constructor and destructor functions.[/color]

    It seems like you're manually reinvent stuff that C++ can do
    for you. For example, if the exact type of the pointee is known
    then just delete it using delete. If the actual type of the pointee
    could be a derived type then you can still use delete as long as
    you declare the base type with a virtual destructor. No need to
    maintain your own function pointer to a destructor function --
    C++ does that for you.
    [color=blue]
    > class OPObject {
    >
    > OPStruct* Value;
    >
    > OPObject& operator=(OPObj ect& obj) {
    > OPUnlock(Value) ;
    > Value = ref.Value;
    > OPLock(Value);
    > return *this;
    > }[/color]

    The operand to operator= should generally be a const reference. It's
    not in the case of auto_ptr but auto_ptr has really funky assignment
    semantics.

    Also, you should increment (lock) before you decrement (unlock) so you
    don't mistakenly free the pointee in the case of self-assignment.

    OPObject& operator=(OPObj ect const& obj)
    {
    OPLock(obj.Valu e);
    OPUnlock(Value) ;
    Value = obj.Value;
    return *this;
    }

    I would suggest more descriptive names -- e.g., OPObject is a
    pointer-like object that manages the lifetime of its pointee
    using reference counting, so why not "CountedPtr " or something.
    But I'll use your names here.
    [color=blue]
    > This is actually an inline function. This is my first question, is it
    > even worth inlining this function?? Especially when I'm using two
    > inlined functions, OPLock and OPUnlock?[/color]

    You could try it both ways and see what the actual impact is on code
    size. Be aware that inline is only a hint to the compiler anyway; it
    may choose not to inline the function.
    [color=blue]
    > inline void OPLock( OPStruct* obj ) {
    > if (obj) {
    > obj->RefCount++;
    > }
    > }
    >
    > inline void OPUnlock( OPStruct* obj ) {
    > if (obj) {
    > long rc = obj->RefCount - 1;
    > obj->RefCount = rc;
    > if (!rc) {
    > OPDeAlloc( obj );
    > }
    > }
    > }[/color]

    These seem like part of the internal implementation of your smart
    pointer class so make them private (and optionally static) member
    functions if they aren't already. If they're member functions then
    you don't need to add prefixes to the names -- just call them Lock
    and Unlock.
    [color=blue]
    > Doesn't seem like much code, but it adds up...
    >
    > It's a lot just to do "a = b;", really.
    >
    > Of course, I don't necessarily need to use my OPObjects everywhere...
    > OPObject is just for memory management really. I can just extract the
    > OPStruct* and pass that around, knowing that it's a bare pointer that
    > could be invalid if stored outside of the lifetime of the OPObject.[/color]

    Or invoke your smart pointer's operator* to obtain a reference and pass
    that around. So whenever a function takes an OPStruct& parameter (or
    OPStruct const&) it's clear that there's now change in "ownership"
    implied, and it's also very cheap.

    If you want to do something that actually imlples ownership, such as
    adding the pointee to a container, then you'd typically pass the smart
    pointer by const-reference.
    [color=blue]
    > The "if (obj)" thing irks me because I just wonder if there is a way to
    > do without it... but it seems like I must have it, because it's quite
    > possible to have an OPStruct* to 0...[/color]

    Well, you own the smart pointer class so you can control its
    invariants. That's what constructors and "private" are for. You
    could design things such that your internal pointer is never NULL,
    and represent the concept "pointer to nothing" by pointing to a
    special instance, thus.

    class OPObject
    {
    public:
    // default-initialize to point to &NullValue
    OPObject(OPStru ct* value = &NullValue);

    private:
    Value* Value;
    static OPStruct NullValue;
    };

    Your Lock/Unlock would then no longer need null checks, but on the
    other hand operator-> would probably need to do a check so as to
    return null if Value==&NullVal ue, so whether it's a net win is up
    to you to decide.
    [color=blue]
    > I just don't know really how to make this into a compile-time type
    > checking safe system.[/color]

    As you guessed, the answer is templates. Again, you might want to look
    at how boost::intrusiv e_ptr does this.

    Comment

    • twoeyedhuman1111

      #3
      Re: Advanced topics... optimising / C++ / OOP / RefCounting / type-safety

      Fricken old people and their c stuff. Goodness.... You need to
      upgrade EVERYTHING!!!!! !!!!!!!!!!!!!!! !!!!!!!!!!!!!!! !!!!!111

      Comment

      • st_ev_fe@hotmail.com

        #4
        Re: Advanced topics... optimising / C++ / OOP / RefCounting / type-safety

        Hi Nik,

        Thanks for the good answers. I've since corrected the lock order bug
        you spotted in my code.

        There are some more considerations that I should have said, so I'll say
        them not.

        My code is for a library that I hope to be usable from C and C++, hence
        the tendancy to use structs instead of classes and using ordinary C
        functions.

        The C++ OPObject that did the refcounting automatically, is just a
        syntactic sugar, not a functional requirement. I'd like it to work in
        pure C. Not because I like C, but because it is so useful for library
        development. A C API can be accessed from many languages or
        environments that a C++ API might not.

        By making the C++ automatic refcounting class be a superficial (but
        very well crafted and incredibly handy) addition ontop of the C layer,
        I'll allow for my compiled library to be executable from languages that
        can access C APIs but not C++ APIs. Hence all the prefixes, like OPLock
        instead of Lock.

        Boost looks very useful actually. If I were using a pure C++
        environment and not caring about general purpose library development,
        I'd look into it. The shared_ptr and intrusive_ptr look pretty much
        exactly what I'm doing...

        I'd still be concerned about Boost's size. As it is I've managed to
        implement a manual refcounting system that has constructors and
        destructors, in just one small .cpp file (with an "Extern "C" API), and
        an optional auto refcounting C++ class. Boost looks huge and scary :o)
        My "ObjectPlatform " as I call it (hence the OP) is small and familiar.

        I see that boost's shared_ptr is recommended for inclusion into the new
        C++ standard, which makes sense as refcounting really is much harder to
        cause errors such as leaks or dangling pointers, for the developer.

        As for the templates for compile time type checking... well I've never
        done templates before but I have no doubt I'll learn to like them if I
        give it a shot. My only concern is size.

        If I template a function that does exactly the same thing on classes of
        different types, will I get different functions compiled? All I'm
        looking for is type safety, remember. Same code, same run-time
        requirements, different compile-time requirements.

        One the one hand I expect this code to be compiled most often with
        modern compilers like gcc. On the other hand even I still use a decade
        old C++ compiler like Apple's old MrCpp. Why MrCpp? Because it creates
        smaller faster code than CodeWarrior does. And CodeWarrior compiles
        smaller faster code than gcc does. So MrCpp is the best in my
        experience.

        Basically I worry if enough compilers will be smart enough to use the
        same code if templated.

        But then it doesn't matter if I am inlining all my C++ functions
        anyhow, right? As long as the OPLock and OPUnlock functions are
        actually not inline, then I won't be getting too much code duplicated.

        I can't really use a C++ class and use C++'s constructor/destructor,
        because I need this library to be compatible across different OOP
        execution environments. It will run in pure C++ environment, some
        basics, and maybe even Java. So I need to have my own class system that
        my ObjectPlatform can massage into a shape acceptable by the hosting
        OOP environment.

        The idea of using a private value which is never null... that seems
        like a good idea. I'll try it out, see if the speed/size difference is
        even noticable. Coders tend to worry about the wrong things. Small
        details get blown out of proportion and big troubles pass under the
        radar :o) I might have been worrying about nothing with the "if (obj)"
        thing.

        Thanks for all the good answers :o)

        niklasb@microso ft.com wrote:[color=blue]
        > st_ev_fe@hotmai l.com wrote:[color=green]
        > > Hi people,
        > >
        > > I've been doing C for about 7 years now. But I'm new to C++.
        > >
        > > I've decided that C++'s operator overloading could be very handy. I'm
        > > writing something much like auto_ptr, except for my own set of classes.
        > > Basically, it's a class that is supposed to be allocated automatically
        > > on the stack, and possesess one member which is a pointer to an object.[/color]
        >
        > Implementing a smart pointer type is a fairly advanced project
        > for someone new to C++. It's true that std::auto_ptr is pretty
        > limited, but you might be better of using some other existing
        > implementation such as boost::shared_p tr rather than rolling
        > your own.
        >
        > That said, I imagine you're really doing this as an exercise
        > and are having too much fun to stop. If so, it's still worth
        > looking at some existing implemetnations such as boost::shared_p tr
        > and boost::intrusiv e_ptr. The latter is pretty close to what
        > you're trying to do -- i.e., a reference-counting smart pointer
        > in which the ref count is stored as a member of the pointee
        > (hence "intrusive" ).
        >[color=green]
        > > Unlike auto_ptr, my object will actually call a lock or unlock function
        > > upon the object. Each object has a "long RefCount" member, and a
        > > "MyClass*" which is a pointer to some type information. The "class"
        > > pointed to is just a statically allocated struct that contains some
        > > nice info about each instance of itself, such as the size, the
        > > constructor and destructor functions.[/color]
        >
        > It seems like you're manually reinvent stuff that C++ can do
        > for you. For example, if the exact type of the pointee is known
        > then just delete it using delete. If the actual type of the pointee
        > could be a derived type then you can still use delete as long as
        > you declare the base type with a virtual destructor. No need to
        > maintain your own function pointer to a destructor function --
        > C++ does that for you.
        >[color=green]
        > > class OPObject {
        > >
        > > OPStruct* Value;
        > >
        > > OPObject& operator=(OPObj ect& obj) {
        > > OPUnlock(Value) ;
        > > Value = ref.Value;
        > > OPLock(Value);
        > > return *this;
        > > }[/color]
        >
        > The operand to operator= should generally be a const reference. It's
        > not in the case of auto_ptr but auto_ptr has really funky assignment
        > semantics.
        >
        > Also, you should increment (lock) before you decrement (unlock) so you
        > don't mistakenly free the pointee in the case of self-assignment.
        >
        > OPObject& operator=(OPObj ect const& obj)
        > {
        > OPLock(obj.Valu e);
        > OPUnlock(Value) ;
        > Value = obj.Value;
        > return *this;
        > }
        >
        > I would suggest more descriptive names -- e.g., OPObject is a
        > pointer-like object that manages the lifetime of its pointee
        > using reference counting, so why not "CountedPtr " or something.
        > But I'll use your names here.
        >[color=green]
        > > This is actually an inline function. This is my first question, is it
        > > even worth inlining this function?? Especially when I'm using two
        > > inlined functions, OPLock and OPUnlock?[/color]
        >
        > You could try it both ways and see what the actual impact is on code
        > size. Be aware that inline is only a hint to the compiler anyway; it
        > may choose not to inline the function.
        >[color=green]
        > > inline void OPLock( OPStruct* obj ) {
        > > if (obj) {
        > > obj->RefCount++;
        > > }
        > > }
        > >
        > > inline void OPUnlock( OPStruct* obj ) {
        > > if (obj) {
        > > long rc = obj->RefCount - 1;
        > > obj->RefCount = rc;
        > > if (!rc) {
        > > OPDeAlloc( obj );
        > > }
        > > }
        > > }[/color]
        >
        > These seem like part of the internal implementation of your smart
        > pointer class so make them private (and optionally static) member
        > functions if they aren't already. If they're member functions then
        > you don't need to add prefixes to the names -- just call them Lock
        > and Unlock.
        >[color=green]
        > > Doesn't seem like much code, but it adds up...
        > >
        > > It's a lot just to do "a = b;", really.
        > >
        > > Of course, I don't necessarily need to use my OPObjects everywhere...
        > > OPObject is just for memory management really. I can just extract the
        > > OPStruct* and pass that around, knowing that it's a bare pointer that
        > > could be invalid if stored outside of the lifetime of the OPObject.[/color]
        >
        > Or invoke your smart pointer's operator* to obtain a reference and pass
        > that around. So whenever a function takes an OPStruct& parameter (or
        > OPStruct const&) it's clear that there's now change in "ownership"
        > implied, and it's also very cheap.
        >
        > If you want to do something that actually imlples ownership, such as
        > adding the pointee to a container, then you'd typically pass the smart
        > pointer by const-reference.
        >[color=green]
        > > The "if (obj)" thing irks me because I just wonder if there is a way to
        > > do without it... but it seems like I must have it, because it's quite
        > > possible to have an OPStruct* to 0...[/color]
        >
        > Well, you own the smart pointer class so you can control its
        > invariants. That's what constructors and "private" are for. You
        > could design things such that your internal pointer is never NULL,
        > and represent the concept "pointer to nothing" by pointing to a
        > special instance, thus.
        >
        > class OPObject
        > {
        > public:
        > // default-initialize to point to &NullValue
        > OPObject(OPStru ct* value = &NullValue);
        >
        > private:
        > Value* Value;
        > static OPStruct NullValue;
        > };
        >
        > Your Lock/Unlock would then no longer need null checks, but on the
        > other hand operator-> would probably need to do a check so as to
        > return null if Value==&NullVal ue, so whether it's a net win is up
        > to you to decide.
        >[color=green]
        > > I just don't know really how to make this into a compile-time type
        > > checking safe system.[/color]
        >
        > As you guessed, the answer is templates. Again, you might want to look
        > at how boost::intrusiv e_ptr does this.[/color]

        Comment

        • Neil Cerutti

          #5
          Re: Advanced topics... optimising / C++ / OOP / RefCounting / type-safety

          On 2005-11-15, st_ev_fe@hotmai l.com <st_ev_fe@hotma il.com> wrote:[color=blue]
          > As for the templates for compile time type checking... well
          > I've never done templates before but I have no doubt I'll learn
          > to like them if I give it a shot. My only concern is size.
          >
          > If I template a function that does exactly the same thing on
          > classes of different types, will I get different functions
          > compiled? All I'm looking for is type safety, remember. Same
          > code, same run-time requirements, different compile-time
          > requirements.
          >
          > One the one hand I expect this code to be compiled most often
          > with modern compilers like gcc. On the other hand even I still
          > use a decade old C++ compiler like Apple's old MrCpp. Why
          > MrCpp? Because it creates smaller faster code than CodeWarrior
          > does. And CodeWarrior compiles smaller faster code than gcc
          > does. So MrCpp is the best in my experience.
          >
          > Basically I worry if enough compilers will be smart enough to
          > use the same code if templated.[/color]

          Check out Item 42 in _Effective C++_ (Meyers). There's a design
          pattern there, whose name I don't know, that answers those
          questions. You implement your interface generically once, to
          operate on void*. Users use typesafe templates that inherit that
          single generic implementation.

          --
          Neil Cerutti

          Comment

          Working...