Debugging memory leaks

  • fmassei@gmail.com

    Debugging memory leaks

    Hello!
    I made a short piece of code that I find very useful for debugging,
    and I wanted to ask you if it is correct, somehow acceptable or if I
    simply reinvented the wheel.
    To deal with some bad bugs caused by memory leaks I ended up with this
    simple solution: I made one header file that, when included, replaces
    the malloc/calloc/realloc/free functions with some other functions
    that do the actual job and insert (or remove) the pointers to the
    allocated memory in a list. This way I can catch many errors, such as
    double frees, allocations of zero size, or missing calls to free.
    Obviously it works only when a given #define is set (NDEBUG, in my
    case).
    What do you think about it?
    Thank you in advance!


    (If you want to see the code I wrote a little page here:
    http://technicalinsanity.org/out/simplegc/index.html)
  • Ben Pfaff

    #2
    Re: Debugging memory leaks

    fmassei@gmail.com writes:
    >I made a short piece of code that I find very useful for debugging,
    >and I wanted to ask you if it is correct, somehow acceptable or if I
    >simply reinvented the wheel.
    >To deal with some bad bugs caused by memory leaks I ended up with this
    >simple solution: I made one header file that, when included, replaces
    >the malloc/calloc/realloc/free functions with some other functions
    >that do the actual job and insert (or remove) the pointers to the
    >allocated memory in a list. This way I can catch many errors as double
    >frees, allocations of zero size, or missing calls to free.
    >Obviously it works only when a given #define is set (NDEBUG, in my
    >case).
    >What do you think about it?
    You have reinvented the wheel. Many programmers have done
    similarly to debug memory leaks. (That doesn't make it any less
    useful to do so.)
    --
    Ben Pfaff


    • dj3vande@csclub.uwaterloo.ca.invalid

      #3
      Re: Debugging memory leaks

      In article <b18ff9b0-0670-44c7-b4f5-971bd4c185aa@t39g2000prh.googlegroups.com>,
      <fmassei@gmail.com> wrote:
      >Hello!
      >I made a short piece of code that I find very useful for debugging,
      >and I wanted to ask you if it is correct, somehow acceptable or if I
      >simply reinvented the wheel.
      >To deal with some bad bugs caused by memory leaks I ended up with this
      >simple solution: I made one header file that, when included, replaces
      >the malloc/calloc/realloc/free functions with some other functions
      >that do the actual job and insert (or remove) the pointers to the
      >allocated memory in a list. This way I can catch many errors as double
      >frees, allocations of zero size, or missing calls to free.
      >Obviously it works only when a given #define is set (NDEBUG, in my
      >case).
      >What do you think about it?
      NDEBUG is usually #define'd to indicate that you DON'T want debugging
      features turned on in the code. So if I'm understanding you correctly
      and you're using it to turn on your debugging malloc family
      replacement, that's probably Not Such A Good Idea.

      Other than that, it sounds like you've reinvented a particularly useful
      wheel; if you don't already have a wheel of that size and load-bearing-
      ness, your version will probably do fine.

      <OT>
      The standard library malloc on my Mac supports most of the checking
      features you describe, and allows them to be turned on by setting
      appropriate environment variables before running the program. I
      strongly suspect that this was inherited from FreeBSD, and would be
      unsurprised if other BSDs have similar debugging support.

      If you're using Linux/x86, you should take a look at Valgrind, which
      can do all of that and more.
      </OT>

      Note that by replacing malloc and friends, you're breaking the contract
      that the definition of the C language specifies between you and the
      implementor of your compiler; it's possible that you'll break something
      by doing this.
      (For debugging memory problems, this is probably acceptable. The worst
      that can happen is that you break things in a way that interferes with
      your debugging, and even in that case reverting to the non-malloc-
      debugging build will leave you back where you started, and all you lose
      is the getting a little bit farther ahead that you were hoping for.)

      If you haven't already implemented them (I didn't look at your code),
      you might also find a few wrapper macros useful:
      --------
      #undef malloc
      #define malloc(sz) my_malloc(sz,__FILE__,__LINE__)
      #undef free
      #define free(ptr) my_free(ptr,__FILE__,__LINE__)
      #undef realloc
      #define realloc(ptr,sz) my_realloc(ptr,sz,__FILE__,__LINE__)
      #undef calloc
      #define calloc(num,sz) my_calloc(num,sz,__FILE__,__LINE__)
      --------
      This allows your debugging versions (which will need to take filename
      and line arguments, of course) to track where they were called from
      and report that when they detect problems.
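
A minimal sketch of what the debugging functions behind such macros might look like, assuming a singly linked list with one record per live allocation. The record layout and the helper my_report_leaks are hypothetical, not taken from the OP's code, and my_realloc/my_calloc are left out for brevity.

```c
#include <stdio.h>
#include <stdlib.h>

/* One record per live allocation (hypothetical layout). */
struct alloc_rec {
    void *ptr;
    size_t size;
    const char *file;          /* expected to be a string literal (__FILE__) */
    long line;
    struct alloc_rec *next;
};

static struct alloc_rec *live_list = NULL;

/* Allocate sz bytes and remember where the request came from. */
void *my_malloc(size_t sz, const char *file, long line)
{
    struct alloc_rec *rec;
    void *p;

    if (sz == 0)
        fprintf(stderr, "%s:%ld: allocation of zero size\n", file, line);
    p = malloc(sz);
    if (p == NULL)
        return NULL;
    rec = malloc(sizeof *rec);
    if (rec == NULL) {         /* cannot track it, so do not hand it out */
        free(p);
        return NULL;
    }
    rec->ptr = p;
    rec->size = sz;
    rec->file = file;
    rec->line = line;
    rec->next = live_list;
    live_list = rec;
    return p;
}

/* Release ptr if it is tracked; otherwise report the call site. */
void my_free(void *ptr, const char *file, long line)
{
    struct alloc_rec **link;

    if (ptr == NULL)
        return;                /* free(NULL) is a no-op */
    for (link = &live_list; *link != NULL; link = &(*link)->next) {
        if ((*link)->ptr == ptr) {
            struct alloc_rec *rec = *link;
            *link = rec->next; /* unlink the record */
            free(rec);
            free(ptr);
            return;
        }
    }
    fprintf(stderr, "%s:%ld: free of untracked pointer %p (double free?)\n",
            file, line, ptr);
}

/* Print every block still allocated; returns how many there are. */
size_t my_report_leaks(void)
{
    const struct alloc_rec *rec;
    size_t n = 0;

    for (rec = live_list; rec != NULL; rec = rec->next) {
        fprintf(stderr, "leak: %lu bytes allocated at %s:%ld\n",
                (unsigned long)rec->size, rec->file, rec->line);
        n++;
    }
    return n;
}
```

These definitions have to be compiled before the macros take effect (or in a separate file that does not define them), since they call the real malloc and free. The linear search in my_free is the simplest thing that works, not the fastest.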


      dave

      --
      Dave Vandervies dj3vande at eskimo dot com
      You can save yourself a lot of work by just making up some results;
      they'll be just as good as those you'd get if you actually ran the
      survey. --Eric Sosman in comp.lang.c


      • jameskuyper@verizon.net

        #4
        Re: Debugging memory leaks

        fmas...@gmail.com wrote:
        >Hello!
        >I made a short piece of code that I find very useful for debugging,
        >and I wanted to ask you if it is correct, somehow acceptable or if I
        >simply reinvented the wheel.
        >To deal with some bad bugs caused by memory leaks I ended up with this
        >simple solution: I made one header file that, when included, replaces
        >the malloc/calloc/realloc/free functions with some other functions
        >that do the actual job and insert (or remove) the pointers to the
        >allocated memory in a list. This way I can catch many errors as double
        >frees, allocations of zero size, or missing calls to free.
        >Obviously it works only when a given #define is set (NDEBUG, in my
        >case).
        >What do you think about it?
        By convention, macros such as gc_need_real should be all upper-case.
        Of course, this is only a convention, but a useful one. If it is
        #defined, you should keep in mind that there may already be standard
        library macros defined with the same names you use. You should #undef
        any such definitions before replacing them with your own. In
        principle, if gc_need_real is #defined, the behavior of any program
        which calls the normal malloc() family is undefined. In practice,
        replacement of standard library functions often works, and is a
        popular method of performing memory leak checks.

        Keep in mind that your approach is completely useless for
        investigating memory leaks that are due to direct calls to the
        malloc() family from libraries that were built without using your
        header files.

        You are quite right to worry about whether you are "reinventing the
        wheel". You should check to see whether a debugging version of the
        malloc() family is provided by your implementation. If one is
        provided, it's probably safer to use than your own replacement. Also,
        this same technique is used by a number of memory leak testing
        packages, some of them much more sophisticated than anything you could
        easily write. In both cases, the replacement library functions
        actually replace the normal standard library functions, rather than
        wrapping them. As a result, you can even detect memory leaks that
        occur in libraries that were compiled without using your header file.


        • Peter Nilsson

          #5
          Re: Debugging memory leaks

          dj3va...@csclub.uwaterloo.ca.invalid wrote:
          >If you haven't already implemented them (I didn't look
          >at your code), you might also find a few wrapper macros
          >useful:
          >--------
          >#undef malloc
          >#define malloc(sz) my_malloc(sz,__FILE__,__LINE__)
          >#undef free
          >#define free(ptr) my_free(ptr,__FILE__,__LINE__)
          >#undef realloc
          >#define realloc(ptr,sz) my_realloc(ptr,sz,__FILE__,__LINE__)
          >#undef calloc
          >#define calloc(num,sz) my_calloc(num,sz,__FILE__,__LINE__)
          >--------
          These potentially cancel the idempotence of the <stdlib.h>
          header. Better to supply (and use) genuine wrappers...

          #ifndef NDEBUG
          #define wrap_malloc(sz) my_malloc(sz, __FILE__, __LINE__)
          #else
          #define wrap_malloc(sz) malloc(sz)
          #endif

          --
          Peter


          • Peter Nilsson

            #6
            Re: Debugging memory leaks

            fmas...@gmail.com wrote:
            >(If you want to see the code I wrote a little page
            >here: http://technicalinsanity.org/out/simplegc/index.html)
            Some obvious things from a quick glance... some are style
            issues, many are correctness issues...

            Your include guards use identifiers reserved for the
            implementation. Suggest you replace __SIMPLEGC_H__
            with H_SIMPLEGC_H.

            You might want to swap the comments...

            #include <stdlib.h> /* needed: printf */
            #include <stdio.h> /* needed: malloc/calloc/realloc/free */

            Better still, delete them.

            You incorrectly print size_t values with %d and pointers
            with %x.

            You should print debugging comments to stderr, not stdout.
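
To illustrate the two points above, here is a short sketch of a logging helper; the function names are made up for the example. It assumes a C99 library, where %zu is the conversion for size_t. On a C89-only implementation one would cast the size to unsigned long and use %lu instead. Pointers are printed with %p after conversion to void *.

```c
#include <stddef.h>
#include <stdio.h>

/* Format one allocation event into buf; returns what snprintf returns.
   (Hypothetical helper, not part of the OP's code.) */
int format_alloc_event(char *buf, size_t buflen, const void *ptr,
                       size_t size, const char *file, long line)
{
    return snprintf(buf, buflen, "%s:%ld: allocated %zu bytes at %p",
                    file, line, size, (void *)ptr);
}

/* Write the same message to stderr rather than stdout. */
void log_alloc_event(const void *ptr, size_t size,
                     const char *file, long line)
{
    char buf[256];

    format_alloc_event(buf, sizeof buf, ptr, size, file, line);
    fprintf(stderr, "%s\n", buf);
}
```

The text produced by %p is implementation-defined, so the exact look of the address will differ between platforms.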

            You should use const char *, instead of char * when
            declaring parameters that don't change the pointed to
            string...

            void *gc_malloc(size_t size, char *fname, size_t fline);

            Also, I'd go with long, not size_t for fline. I've used
            implementations where size_t's range is 0..65535, but
            __LINE__ could exceed that. Of course, the same problem
            exists if there are LONG_MAX lines, but I was much less
            concerned about that. ;)

            Note: NDEBUG means 'NO DEBUG'.

            In any case, I suggest that you _not_ make your utility
            functions subject to NDEBUG. Simply define the functions
            you need. If they aren't used, they aren't used. Smart
            linkers will remove them, but even if they don't, suppose
            you're linking two modules where you want the debugging
            on one, but not the other.

            Two things regarding...

            if ((p->fname = malloc(strlen(fname)+1))==NULL) {

            Suppressing the macro with (malloc)(...) is simpler than
            the gc_need_real kludge. Also, rather than copying the
            string, I'd just assign it and put a condition on your
            function that the file names must be constant strings with
            static storage duration, e.g. string literals. [I doubt this
            will worry anyone who uses your function.]
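
A small sketch of the parenthesised-name trick, with made-up names (counting_malloc, via_macro, via_real). A function-like macro is only expanded when its name is followed by an opening parenthesis, so (malloc)(sz) reaches the real function even while a malloc macro is in effect. As noted elsewhere in this thread, redefining malloc as a macro is not something the standard promises will work; this only shows the mechanism.

```c
#include <stdlib.h>

static unsigned long wrapped_calls = 0;

/* Debugging replacement: counts calls, then uses the real malloc. */
void *counting_malloc(size_t sz)
{
    wrapped_calls++;
    return (malloc)(sz);       /* parenthesised name: never macro-expanded */
}

#undef malloc
#define malloc(sz) counting_malloc(sz)

/* Goes through the macro, so the call is counted. */
void *via_macro(size_t sz)
{
    return malloc(sz);
}

/* Bypasses the macro, so the call is not counted. */
void *via_real(size_t sz)
{
    return (malloc)(sz);
}

unsigned long wrapped_call_count(void)
{
    return wrapped_calls;
}
```

With this in place the wrapper itself needs no special switch to get at the real allocator.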

            --
            Peter


            • Nate Eldredge

              #7
              Re: Debugging memory leaks

              Peter Nilsson <airia@acay.com.au> writes:
              >Also, I'd go with long, not size_t for fline. I've used
              >implementations where size_t's range is 0..65535, but
              >__LINE__ could exceed that. Of course, the same problem
              >exists if there are LONG_MAX lines, but I was much less
              >concerned about that. ;)
              Hmm. On such implementations, assuming `int' was also 16 bits, did an
              occurrence of __LINE__ on line 65537 expand to `65537L'? It seems like
              it would be a bug if it didn't.

              This is an interesting issue, because ordinarily the preprocessor
              wouldn't know about things like the ranges for types. Also, people use
              C preprocessors for a lot of non-C languages, and I bet they don't
              expect that behavior.


              • Keith Thompson

                #8
                Re: Debugging memory leaks

                Nate Eldredge <nate@vulcan.lan> writes:
                >Peter Nilsson <airia@acay.com.au> writes:
                >>Also, I'd go with long, not size_t for fline. I've used
                >>implementations where size_t's range is 0..65535, but
                >>__LINE__ could exceed that. Of course, the same problem
                >>exists if there are LONG_MAX lines, but I was much less
                >>concerned about that. ;)
                >>
                >Hmm. On such implementations, assuming `int' was also 16 bits, did an
                >occurence of __LINE__ on line 65537 expand to `65537L'? It seems like
                >it would be a bug if it didn't.
                The L suffix isn't necessary. If int is 16 bits, then the unadorned
                constant 65537 is of type long.

                [...]

                --
                Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                Nokia
                "We must do something. This is something. Therefore, we must do this."
                -- Antony Jay and Jonathan Lynn, "Yes Minister"


                • Nate Eldredge

                  #9
                  Re: Debugging memory leaks

                  Keith Thompson <kst-u@mib.org> writes:
                  >Nate Eldredge <nate@vulcan.lan> writes:
                  >>Peter Nilsson <airia@acay.com.au> writes:
                  >>>Also, I'd go with long, not size_t for fline. I've used
                  >>>implementations where size_t's range is 0..65535, but
                  >>>__LINE__ could exceed that. Of course, the same problem
                  >>>exists if there are LONG_MAX lines, but I was much less
                  >>>concerned about that. ;)
                  >>>
                  >>Hmm. On such implementations, assuming `int' was also 16 bits, did an
                  >>occurence of __LINE__ on line 65537 expand to `65537L'? It seems like
                  >>it would be a bug if it didn't.
                  >>
                  >The L suffix isn't necessary. If int is 16 bits, then the unadorned
                  >constant 65537 is of type long.
                  Aha. 6.4.4.1 (5). Thanks. I think I've been using the L suffix
                  unnecessarily all along.


                  • fmassei@gmail.com

                    #10
                    Re: Debugging memory leaks

                    Thank you all, you gave me a lot of input and good suggestions! I'm
                    going to modify the code asap.
                    Regarding the use of implementation-dependent tools, of course you are
                    right, but I'd like to have an implementation-independent tool (even
                    if a very, very simple one) so that I am free to debug the applications
                    in all the target environments. That is the main reason why I prefer
                    not to rely on very advanced software like valgrind, even if it can
                    sometimes save your day.


                    • Chris Ahlstrom

                      #11
                      Re: Debugging memory leaks

                      After takin' a swig o' grog, fmassei@gmail.com belched out
                      this bit o' wisdom:
                      >Hello!
                      >I made a short piece of code that I find very useful for debugging,
                      >and I wanted to ask you if it is correct, somehow acceptable or if I
                      >simply reinvented the wheel.
                      >To deal with some bad bugs caused by memory leaks I ended up with this
                      >simple solution: I made one header file that, when included, replaces
                      >the malloc/calloc/realloc/free functions with some other functions
                      >that do the actual job and insert (or remove) the pointers to the
                      >allocated memory in a list. This way I can catch many errors as double
                      >frees, allocations of zero size, or missing calls to free.
                      >Obviously it works only when a given #define is set (NDEBUG, in my
                      >case).
                      >
                      >(If you want to see the code I wrote a little page here:
                      >http://technicalinsanity.org/out/simplegc/index.html)
                      If you're on a platform that supports it, you can use valgrind
                      to check for leaks and other issues. It's pretty cool:

                         http://valgrind.org/

                      Also Electric Fence:

                         http://perens.com/works/software/

                      --
                      One does not thank logic.
                      -- Sarek, "Journey to Babel", stardate 3842.4


                      • blargg

                        #12
                        Re: Debugging memory leaks

                        In article <lnbpxmfkrp.fsf@nuthaus.mib.org>, Keith Thompson
                        <kst-u@mib.org> wrote:
                        >Nate Eldredge <nate@vulcan.lan> writes:
                        >>Peter Nilsson <airia@acay.com.au> writes:
                        >>>Also, I'd go with long, not size_t for fline. I've used
                        >>>implementations where size_t's range is 0..65535, but
                        >>>__LINE__ could exceed that. Of course, the same problem
                        >>>exists if there are LONG_MAX lines, but I was much less
                        >>>concerned about that. ;)
                        >>Hmm. On such implementations, assuming `int' was also 16 bits, did an
                        >>occurence of __LINE__ on line 65537 expand to `65537L'? It seems like
                        >>it would be a bug if it didn't.
                        >
                        >The L suffix isn't necessary. If int is 16 bits, then the unadorned
                        >constant 65537 is of type long.
                        >
                        >[...]
                        If int is 16 bits, wouldn't the constant 32768 also be of type long int
                        (and 32767 of type int)? 3.1.3.2 in C89, 6.4.4.1 in C99.
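
The rule being discussed (an unsuffixed decimal constant takes the first of int, long, long long that can represent it, per C99 6.4.4.1) can be observed directly. This sketch assumes a C11 compiler for _Generic and a platform where int is at most 32 bits wide; the macro and function names are invented for the example. With a 16-bit int, 32768 would show the same effect that 2147483648 shows here.

```c
/* Map the type of an integer expression to a readable name (C11). */
#define INT_TYPE_NAME(x) _Generic((x), \
    int: "int",                        \
    long: "long",                      \
    long long: "long long",            \
    default: "other")

/* 32767 fits in int on every conforming implementation. */
const char *type_of_32767(void)
{
    return INT_TYPE_NAME(32767);
}

/* 2147483648 does not fit in a 32-bit int, so with no suffix at all it
   gets type long or long long, whichever is the first that can hold it. */
const char *type_of_2147483648(void)
{
    return INT_TYPE_NAME(2147483648);
}
```

So __LINE__ needs no suffix to get a type wide enough for its value.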


                        • Paul Hsieh

                          #13
                          Re: Debugging memory leaks

                          On Oct 14, 3:03 pm, fmas...@gmail.com wrote:
                          >Hello!
                          >I made a short piece of code that I find very useful for debugging,
                          >and I wanted to ask you if it is correct, somehow acceptable or if I
                          >simply reinvented the wheel.
                          Yes, you are reinventing the wheel. However, the C standard
                          practically *demands* that you re-invent the wheel, because modern
                          systems require that you make numerous complicated decisions in
                          implementing this that the C standard does not address. (I.e., the
                          right answer is that the C standard should simply *provide* greater
                          dynamic memory functionality which makes this either easier to
                          implement or fall trivially out of the C standard. Force the compiler
                          vendors to make the platform more portable, not developers.)
                          >To deal with some bad bugs caused by memory leaks I ended up with this
                          >simple solution: I made one header file that, when included, replaces
                          >the malloc/calloc/realloc/free functions with some other functions
                          >that do the actual job and insert (or remove) the pointers to the
                          >allocated memory in a list.
                          Well ... if all you want to do is see if you are leaking, you could
                          just keep a count of the size of each allocation in a hidden header,
                          and just add and subtract as necessary and then print the last total
                          in an atexit() routine or something like that.
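
A minimal sketch of that counting scheme, assuming C11 (for max_align_t) and a C99 printf (for %zu). The names count_malloc, count_free and count_outstanding are invented for the example. The size of each block lives in a hidden header just before the pointer handed to the caller, and the running total is printed by an atexit() handler.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hidden header stored just before each block; the union pads it so the
   user pointer stays suitably aligned for any object type. */
union count_hdr {
    size_t size;
    max_align_t align;
};

static size_t bytes_outstanding = 0;
static int report_registered = 0;

static void report_outstanding(void)
{
    fprintf(stderr, "bytes still allocated at exit: %zu\n",
            bytes_outstanding);
}

void *count_malloc(size_t sz)
{
    union count_hdr *h;

    if (sz > (size_t)-1 - sizeof *h)
        return NULL;                   /* header + sz would overflow */
    h = malloc(sizeof *h + sz);
    if (h == NULL)
        return NULL;
    if (!report_registered)            /* register the report only once */
        report_registered = (atexit(report_outstanding) == 0);
    h->size = sz;
    bytes_outstanding += sz;
    return h + 1;                      /* caller sees memory after header */
}

void count_free(void *ptr)
{
    union count_hdr *h;

    if (ptr == NULL)
        return;
    h = (union count_hdr *)ptr - 1;    /* step back to the hidden header */
    bytes_outstanding -= h->size;
    free(h);
}

size_t count_outstanding(void)
{
    return bytes_outstanding;
}
```

This tells you that something leaked and how much, but not where it was allocated.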

                          The real problem with the C standard is that you could do a
                          *LOT* more than that, and it would be really useful. You can use special
                          guard patterns to detect bounds corruptions and also keep minimum and
                          maximum address bounds (on architectures where that makes sense) to
                          quickly detect garbage pointers being freed or realloced. The C
                          standard could expose functions like

                          int isLiveAllocation (const void *ptr);
                          size_t totalAllocatedMem (void);
                          size_t memSize (const void * ptr);
                          int walkEachAllocation (int (* walker) (const void * ptr, size_t
                          sz, void * ctx), void * ctx);

                          which are all fairly easily implementable with any serious dynamic
                          memory architecture. The value of such functions is so obvious,
                          especially in light of what you are trying to do.

                          There are also special problems that you need to deal with on some
                          platforms. strdup() calls the library's malloc directly, so if you
                          try to release its result with an overloaded free() macro, you may
                          encounter problems. Basically, you need to define your own strdup as
                          well. You don't need to do this with functions like fopen, of course,
                          since they provide their own clean up (i.e., fclose).
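
A sketch of such a replacement, using invented names. dbg_malloc and dbg_free stand in for whatever tracked allocator is in use (here they only count live blocks), and dbg_strdup allocates through them so that its result can safely be handed to the tracked free.

```c
#include <stdlib.h>
#include <string.h>

static unsigned long live_blocks = 0;

/* Stand-in for the tracked allocator: counts blocks, nothing more. */
void *dbg_malloc(size_t sz)
{
    void *p = malloc(sz);

    if (p != NULL)
        live_blocks++;
    return p;
}

/* Stand-in for the tracked free. */
void dbg_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

/* strdup lookalike that goes through the tracked allocator. */
char *dbg_strdup(const char *s)
{
    size_t n = strlen(s) + 1;          /* include the terminating '\0' */
    char *copy = dbg_malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

unsigned long dbg_live_blocks(void)
{
    return live_blocks;
}
```

The same applies to any other library function that returns memory the caller is expected to free.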
                          >[...] This way I can catch many errors as double
                          >frees, allocations of zero size, or missing calls to free.
                          >Obviously it works only when a given #define is set (NDEBUG, in my
                          >case).
                          >What do you think about it?
                          >Thank you in advance!
                          >
                          >(If you want to see the code I wrote a little page here: http://technicalinsanity.org/out/simplegc/index.html)
                          It's fine as a first-pass kind of thing.

                          The main problem, of course, is that your free is horrendously slow
                          for large numbers of outstanding allocations. There is a common trick of
                          making the header appear at the address: ((char*)ptr) - sizeof
                          (header) which allows you to work around this performance problem.
                          The idea is that a pointer range check, alignment and header signature
                          check are very highly probabilistically good enough to detect a purely
                          bogus pointer. It's important to support programs that have a massive
                          number of memory allocations outstanding because that's where you will
                          get primary value from such a debugging mechanism.
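
A sketch of the header trick, assuming C11 (max_align_t, _Alignof) and an implementation that provides uintptr_t. All names and the two signature values are arbitrary choices for the example. Freeing becomes a constant-time check of range, alignment and signature instead of a list walk. As said above, the check is probabilistic: for a pointer that never came from sig_malloc, reading the bytes in front of it is not guaranteed to be safe.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SIG_LIVE  0xA110CA7EUL   /* arbitrary marker for a live block  */
#define SIG_FREED 0xDEADBEEFUL   /* arbitrary marker for a freed block */

/* Header placed immediately before the user pointer; the union pads it
   so the user pointer keeps the strictest alignment. */
union sig_hdr {
    struct { unsigned long sig; size_t size; } f;
    max_align_t align;
};

static uintptr_t lowest = UINTPTR_MAX;   /* lowest user address handed out  */
static uintptr_t highest = 0;            /* highest user address handed out */

void *sig_malloc(size_t sz)
{
    union sig_hdr *h;
    uintptr_t user;

    if (sz > SIZE_MAX - sizeof *h)
        return NULL;                     /* header + sz would overflow */
    h = malloc(sizeof *h + sz);
    if (h == NULL)
        return NULL;
    h->f.sig = SIG_LIVE;
    h->f.size = sz;
    user = (uintptr_t)(h + 1);
    if (user < lowest)
        lowest = user;
    if (user > highest)
        highest = user;
    return h + 1;
}

/* Returns 0 on success, -1 if ptr does not look like a live block. */
int sig_free(void *ptr)
{
    union sig_hdr *h;
    uintptr_t user = (uintptr_t)ptr;

    if (ptr == NULL)
        return 0;
    if (user < lowest || user > highest)       /* range check     */
        return -1;
    if (user % _Alignof(max_align_t) != 0)     /* alignment check */
        return -1;
    h = (union sig_hdr *)ptr - 1;
    if (h->f.sig != SIG_LIVE)                  /* signature check */
        return -1;
    h->f.sig = SIG_FREED;                      /* helps spot a later double free */
    free(h);
    return 0;
}
```

A guard pattern after the user area could be added the same way to catch overruns; it is omitted here to keep the sketch short.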

                          And of course, this is pretty useless in multithreaded environments.
                          You need to make some sort of abstraction for a mutex or lock for your
                          memory (if you care about portability, otherwise you can just go ahead
                          and use the platform's specific mutexes). You can be clever and use a
                          hash on the value of ptr to create an array of striped locks to
                          increase parallelism (since malloc itself will be hit with a higher
                          degree of parallelism than you would otherwise encounter if you are
                          using a single lock), but that might be overkill.
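
A sketch of the striped-lock idea using POSIX threads, which is a platform-specific choice and not portable C. The stripe count, the hash and the function names are all assumptions made for the example. Striping only helps if the bookkeeping is split the same way (for instance one list of records per stripe); a single shared list still needs a single lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define N_STRIPES 4

static pthread_mutex_t stripes[N_STRIPES] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* Pick a stripe from the pointer value. The low bits are dropped because
   they are usually zero for malloc'ed memory due to alignment. */
size_t stripe_of(const void *ptr)
{
    uintptr_t v = (uintptr_t)ptr >> 4;

    v ^= v >> 7;                       /* mix in some higher bits */
    return (size_t)(v % N_STRIPES);
}

/* Lock the stripe guarding the bookkeeping for ptr; 0 on success. */
int lock_stripe(const void *ptr)
{
    return pthread_mutex_lock(&stripes[stripe_of(ptr)]);
}

/* Unlock the stripe guarding the bookkeeping for ptr; 0 on success. */
int unlock_stripe(const void *ptr)
{
    return pthread_mutex_unlock(&stripes[stripe_of(ptr)]);
}
```

A debugging free would call lock_stripe(ptr), update the records for that stripe, then call unlock_stripe(ptr).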

                          --
                          Paul Hsieh

                          • user923005

                            #14
                            Re: Debugging memory leaks

                            On Oct 15, 7:24 am, Chris Ahlstrom <lino...@bollsouth.nut> wrote:
                            >After takin' a swig o' grog, fmas...@gmail.com belched out
                            >this bit o' wisdom:
                            >
                            >>Hello!
                            >>I made a short piece of code that I find very useful for debugging,
                            >>and I wanted to ask you if it is correct, somehow acceptable or if I
                            >>simply reinvented the wheel.
                            >>To deal with some bad bugs caused by memory leaks I ended up with this
                            >>simple solution: I made one header file that, when included, replaces
                            >>the malloc/calloc/realloc/free functions with some other functions
                            >>that do the actual job and insert (or remove) the pointers to the
                            >>allocated memory in a list. This way I can catch many errors as double
                            >>frees, allocations of zero size, or missing calls to free.
                            >>Obviously it works only when a given #define is set (NDEBUG, in my
                            >>case).
                            >>
                            >>(If you want to see the code I wrote a little page here:
                            >>http://technicalinsanity.org/out/simplegc/index.html)
                            >
                            >If you're on a platform that supports it, you can use valgrind
                            >to check for leaks and other issues.  It's pretty cool:
                            >
                            >   http://valgrind.org/
                            >
                            >Also Electric Fence:
                            >
                            >   http://perens.com/works/software/
                            I really like this thing:
                            http://en.wikipedia.org/wiki/Duma_(software)

                            This general article may prove helpful for the OP:


                            • Lorenzo Villari

                              #15
                              Re: Debugging memory leaks


                              "Paul Hsieh" <websnarf@gmail.com> wrote in message
                              news:869645a4-a8a4-4f61-b726-32a2638a6440@k16g2000hsf.googlegroups.com...
                              >Yes, you are reinventing the wheel. However, the C standard
                              >practically *demands* that you re-invent the wheel, because modern
                              >systems require that you make numerous complicated decisions in
                              >implementing this that the C standard does not address. (I.e., the
                              >right answer is that the C standard should simply *provide* greater
                              >dynamic memory functionality which makes this either easier to
                              >implement or fall trivially out of the C standard. Force the compiler
                              >vendors to make the platform more portable, not developers.)
                              That's because you can do the same thing in many different ways. The day
                              after they add this to standard C, some people will say they don't like the
                              way it was specified, that they could do it better, and so on. So I think
                              wheel reinventing will never end...

