Hunting a memory leak

  • Debian User

    Hunting a memory leak


    Hi,

    I'm trying to track down a memory leak in a program of mine. I've tried
    several approaches, but the leak still refuses to show itself.

    First of all, I tried using the garbage collector to look for
    uncollectable objects. I used the following:

    import gc

    # at the beginning of code
    gc.enable()
    gc.set_debug(gc.DEBUG_LEAK)

    <snip>

    # at the end of code:
    print "\nGARBAGE: "
    gc.collect()

    print "\nGARBAGE OBJECTS:"
    for x in gc.garbage:
        s = str(x)
        print type(x), "\n ", s

    With that, I get no garbage objects.
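
For comparison, a minimal sketch of what this kind of check does catch, assuming a modern Python 3 interpreter (the thread itself targets Python 2.3). The `Node` class and both function names are hypothetical. With `gc.DEBUG_SAVEALL` set, the collector stores every unreachable object it finds in `gc.garbage` instead of freeing it, so a pure-Python reference cycle shows up there. An empty `gc.garbage` therefore only rules out cycles among gc-tracked Python objects; it says nothing about memory allocated outside the Python object system.

```python
import gc


class Node:
    """Hypothetical object that takes part in a reference cycle."""

    def __init__(self):
        self.partner = None


def make_cycle():
    # Two objects that point at each other and are then abandoned.
    a, b = Node(), Node()
    a.partner, b.partner = b, a


def find_saved_nodes():
    """Return the Node instances the collector found unreachable."""
    gc.collect()                     # flush unrelated garbage first
    previous_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)
    try:
        make_cycle()
        gc.collect()                 # unreachable objects land in gc.garbage
        return [obj for obj in gc.garbage if isinstance(obj, Node)]
    finally:
        gc.set_debug(previous_flags)
        del gc.garbage[:]            # release what the debug flag kept alive
```

Calling `find_saved_nodes()` should hand back the two abandoned `Node` objects; a leak that lives in C-level buffers would leave the list empty.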

    Then I took an approach that I saw on the Python developers list,
    contributed by Walter Dörwald. It basically consists of building a
    debug version of Python, writing a unittest around the leaking code, and
    modifying unittest.py to report the increase in the total reference
    count across that code (see
    http://aspn.activestate.com/ASPN/Mai...n-dev/1770868).

    With that, I see that my reference count grows by one each time the
    test executes. But the problem is: is there some way to look at the
    object that is leaking (or to make a memory dump)?

    I've used valgrind (http://developer.kde.org/~sewardj/) to see if it
    could detect the leak. In fact, it detects a bunch of them, but I am
    afraid that they are not related to the leak I'm looking for. I say
    that because, when I loop over my leaky code, valgrind always
    reports the same amount of leaked memory, regardless of the number of
    iterations (while top tells me that memory use is growing!).

    My code uses extension modules in C, which I am afraid does not
    make the problem any easier. I think every malloc is
    correctly freed, but I can't be sure (however, valgrind does not
    detect anything wrong in the extension).

    I am sorry, but I cannot be more explicit about the code because it
    is quite complex (it is the PyTables package, http://pytables.sf.net),
    and I was unable to produce a simple example to post
    here. However, if anyone is tempted to have a look at the code, you
    can download it from
    (http://sourceforge.net/project/showf...roup_id=63486). I am
    attaching a unittest that exposes the leak.

    I am a bit desperate. Any hint?

    Francesc Alted

    --

    # Unittest to expose the memory leak
    import sys
    import unittest
    import os
    import tempfile

    from tables import *
    # Next imports are only necessary for this test suite
    #from tables import Group, Leaf, Table, Array

    verbose = 0

    class WideTreeTestCase(unittest.TestCase):

        def test00_Leafs(self):

            import time
            maxchilds = 2
            if verbose:
                print '\n', '-=' * 30
                print "Running %s.test00_wideTree..." % \
                      self.__class__.__name__
                print "Maximum number of childs tested :", maxchilds
            # Open a new empty HDF5 file
            file = tempfile.mktemp(".h5")
            #file = "test_widetree.h5"

            fileh = openFile(file, mode = "w")
            if verbose:
                print "Children writing progress: ",
            for child in range(maxchilds):
                if verbose:
                    print "%3d," % (child),
                a = [1, 1]
                fileh.createGroup(fileh.root, 'group' + str(child),
                                  "child: %d" % child)
                # Comment the createArray call to see the leak disappear
                fileh.createArray("/group" + str(child), 'array' + str(child),
                                  a, "child: %d" % child)
            if verbose:
                print
            # Close the file
            fileh.close()


    #----------------------------------------------------------------------

    def suite():
        theSuite = unittest.TestSuite()
        theSuite.addTest(unittest.makeSuite(WideTreeTestCase))

        return theSuite


    if __name__ == '__main__':
        unittest.main(defaultTest='suite')

  • Michael Hudson

    #2
    Re: Hunting a memory leak

    Debian User <falted@inspiron.openlc.org> writes:

    > I'm trying to discover a memory leak on a program of mine. I've taken
    > several approaches, but the leak still resists to appear.
    >
    > First of all, I've tried to use the garbage collector to look for
    > uncollectable objects.

    [snip]

    > Then I've taken an approach that I've seen in python developers list
    > contributed by Walter Dörwald, that basically consists in creating a
    > debug version of python, create a unitest with the leaking code, and
    > modify the unittest.py to extract the increment of total reference
    > counting in that code (see
    > http://aspn.activestate.com/ASPN/Mai...n-dev/1770868).

    Well, somewhere in that same thread are various references to a
    TrackRefs class. Have you tried using that? It should tell you what
    type of object is leaking, which is a good start.
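
The TrackRefs class itself is not reproduced in this thread. As a rough stand-in, here is a sketch (assuming Python 3; `TypeCountTracker`, `Leaky` and `leak_some` are hypothetical names, not the Zope code) that counts gc-tracked objects per type and reports what changed between two calls. It only sees objects the collector tracks, so plain strings and integers do not appear.

```python
import gc
from collections import Counter


class TypeCountTracker:
    """Report per-type changes in the number of gc-tracked objects."""

    def __init__(self):
        self._last = self._snapshot()

    @staticmethod
    def _snapshot():
        gc.collect()
        return Counter(type(obj).__name__ for obj in gc.get_objects())

    def update(self):
        """Return {type name: change in count} since the previous call."""
        current = self._snapshot()
        names = set(current) | set(self._last)
        delta = {name: current[name] - self._last[name] for name in names}
        self._last = current
        return {name: diff for name, diff in delta.items() if diff}


class Leaky:
    """Hypothetical type whose instances are never released."""


_kept = []


def leak_some(count):
    # Simulate a leak: the objects stay reachable from a module-level list.
    _kept.extend(Leaky() for _ in range(count))
```

Typical use is `tracker = TypeCountTracker()`, then run the suspect code, then inspect `tracker.update()` for types whose count keeps climbing.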

    > With that, I see that my reference count grows by one each time the
    > test execute. But the problem is: is there some way to look at the
    > object (or make a memory dump) that is leaking?.

    See above :-)

    > I've used valgrind (http://developer.kde.org/~sewardj/) to see if it
    > could detect the leak. In fact, it detects a bunch of them, but I am
    > afraid that they are not related with the leak I'm looking for. I am
    > saying that because, when I loop over my leaky code, valgrind always
    > report the same amount of leaky memory, independently of the number of
    > iterations (while top is telling me that memory use is growing!).

    There are various things (interned strings, for example) that always tend to
    be alive at the end of a Python program: these are only leaks in a
    very warped sense.

    I don't know if there's a way to get valgrind to tell you what's
    allocated but not deallocated between two arbitrary points of program
    execution.

    > My code uses extension modules in C, so I am afraid this does not
    > contribute to alleviate the problem.

    Well, in all likelihood, the bug is IN the C extension module. Have
    you tried stepping through the code in a debugger? Sometimes that's
    a good way of spotting a logic error.

    > I am sorry, but I cannot be more explicit about the code because it
    > is quite complex (it is the PyTables package, http://pytables.sf.net),
    > and I was unable to make a simple example to be published
    > here. However, if anyone is tempted to have a look at the code, you
    > can download it from
    > (http://sourceforge.net/project/showf...roup_id=63486). I am
    > attaching a unittest that exposes the leak.
    >
    > I am a bit desperate. Any hint?

    Not really. Try using TrackRefs.

    Cheers,
    mwh

    --
    I'm about to search Google for contract assassins to go to Iomega
    and HP's programming groups and kill everyone there with some kind
    of electrically charged rusty barbed thing.
    -- http://bofhcam.org/journal/journal.html, 2002-01-08


    • Edward K. Ream

      #3
      Re: Hunting a memory leak

      > I'm trying to discover a memory leak on a program of mine. I've taken
      > several approaches, but the leak still resists to appear.

      First, single-stepping through C code is surprisingly effective. I heartily
      recommend it.

      Here are some ideas you might use if you are truly desperate. You will have
      to do some work to make them useful in your situation.

      1. Keep track of all newly-created objects. Warning: the id trick used in
      this code is not reliable, because newly allocated objects can have the same
      address as old objects, so you should devise a better way of building a
      unique key. Or just use the code as is and see whether the "invalid" code
      tells you something ;-)

      global lastObjectsDict
      objects = gc.get_objects()

      newObjects = [o for o in objects if not lastObjectsDict.has_key(id(o))]

      lastObjectsDict = {}
      for o in objects:
          lastObjectsDict[id(o)] = o

      2. Keep track of the number of objects.

      def printGc(message=None, onlyPrintChanges=False):

          if not debugGC: return None

          if not message:
              message = callerName(n=2) # Left as an exercise for the reader.

          global lastObjectCount

          try:
              n = len(gc.garbage)
              n2 = len(gc.get_objects())
              delta = n2 - lastObjectCount
              if not onlyPrintChanges or delta:
                  if n:
                      print "garbage: %d, objects: %+6d =%7d %s" % (
                          n, delta, n2, message)
                  else:
                      print "objects: %+6d =%7d %s" % (
                          n2 - lastObjectCount, n2, message)

              lastObjectCount = n2
              return delta
          except:
              traceback.print_exc()
              return None

      3. Print lots and lots of info...

      def printGcRefs(verbose=True):

          refs = gc.get_referrers(app().windowList[0])
          print '-' * 30

          if verbose:
              print "refs of", app().windowList[0]
              for ref in refs:
                  print type(ref)
                  if 0: # very verbose
                      if type(ref) == type({}):
                          keys = ref.keys()
                          keys.sort()
                          for key in keys:
                              val = ref[key]
                              if isinstance(val, leoFrame.LeoFrame): # changes as needed
                                  print key, ref[key]
          else:
              print "%d referrers" % len(refs)

      Here app().windowList is a key data structure of my app. Substitute your
      own as a new argument.

      Basically, Python will give you all the information you need. The problem
      is that there is way too much info, so you must experiment with filtering
      it. Don't panic: you can do it.

      4. A totally different approach. Consider this function:

      def clearAllIvars(o):

          """Clear all ivars of o, a member of some class."""

          o.__dict__.clear()

      This function will grind concrete walls into grains of sand. The GC will
      then recover each grain separately.

      My app contains several classes that refer to each other. Rather than
      tracking all the interlocking references, when it comes time to delete the
      main data structure my app simply calls clearAllIvars for the various
      classes. Naturally, some care is needed to ensure that calls are made in
      the proper order.
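
Ideas 1 and 3 combine naturally: find the objects that appeared since a baseline, then ask the collector what refers to them. A minimal sketch, assuming Python 3; `Record`, `registry` and `leak` are hypothetical stand-ins for real application objects. The baseline holds strong references to every object it saw, so an id cannot be recycled while the baseline is alive; the price is that those objects stay alive too, which is tolerable for a short debugging session.

```python
import gc


def take_baseline():
    """Map id -> object for every gc-tracked object alive right now."""
    gc.collect()
    return {id(obj): obj for obj in gc.get_objects()}


def new_objects(baseline, wanted_type):
    """Objects of exactly wanted_type that were not in the baseline."""
    gc.collect()
    return [obj for obj in gc.get_objects()
            if type(obj) is wanted_type and id(obj) not in baseline]


def referrer_type_names(obj):
    """Sorted type names of the containers that refer to obj."""
    return sorted({type(ref).__name__ for ref in gc.get_referrers(obj)})


class Record:
    """Hypothetical leaked object."""


registry = {}


def leak():
    # Simulated leak: every call parks one more Record in a global dict.
    registry[len(registry)] = Record()
```

After `baseline = take_baseline()` and a few calls to the suspect code, `new_objects(baseline, Record)` lists the survivors and `referrer_type_names(...)` hints at which container is holding on to them.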

      HTH.

      Edward
      --------------------------------------------------------------------
      Edward K. Ream email: edreamleo@charter.net
      Leo: Literate Editor with Outlines
      Leo: http://webpages.charter.net/edreamleo/front.html
      --------------------------------------------------------------------



      • Francesc Alted

        #4
        Re: Hunting a memory leak

        On 2003-08-29, Michael Hudson <mwh@python.net> wrote:
        > Debian User <falted@inspiron.openlc.org> writes:
        >
        >> I'm trying to discover a memory leak on a program of mine. I've taken
        >> several approaches, but the leak still resists to appear.
        >>
        >> First of all, I've tried to use the garbage collector to look for
        >> uncollectable objects.
        >
        > [snip]
        >
        >> Then I've taken an approach that I've seen in python developers list
        >> contributed by Walter Dörwald, that basically consists in creating a
        >> debug version of python, create a unitest with the leaking code, and
        >> modify the unittest.py to extract the increment of total reference
        >> counting in that code (see
        >> http://aspn.activestate.com/ASPN/Mai...n-dev/1770868).
        >
        > Well, somewhere in that same thread are various references to a
        > TrackRefs class. Have you tried using that? It should tell you what
        > type of object is leaking, which is a good start.
        >
        >> With that, I see that my reference count grows by one each time the
        >> test execute. But the problem is: is there some way to look at the
        >> object (or make a memory dump) that is leaking?.
        >
        > See above :-)
        >
        >> I've used valgrind (http://developer.kde.org/~sewardj/) to see if it
        >> could detect the leak. In fact, it detects a bunch of them, but I am
        >> afraid that they are not related with the leak I'm looking for. I am
        >> saying that because, when I loop over my leaky code, valgrind always
        >> report the same amount of leaky memory, independently of the number of
        >> iterations (while top is telling me that memory use is growing!).
        >
        > There are various things (interned strings, f'ex) that always tend to
        > be alive at the end of a Python program: these are only leaks in a
        > very warped sense.
        >
        > I don't know if there's a way to get vaglrind to tell you what's
        > allocated but not deallocated between two arbitrary points of program
        > execution.
        >
        >> My code uses extension modules in C, so I am afraid this does not
        >> contribute to alleviate the problem.
        >
        > Well, in all likelyhood, the bug is IN the C extension module. Have
        > you tried stepping through the code in a debugger? Sometime's that's
        > a good way of spotting a logic error.
        >
        >> I am sorry, but I cannot be more explicit about the code because it
        >> is quite complex (it is the PyTables package, http://pytables.sf.net),
        >> and I was unable to make a simple example to be published
        >> here. However, if anyone is tempted to have a look at the code, you
        >> can download it from
        >> (http://sourceforge.net/project/showf...roup_id=63486). I am
        >> attaching a unittest that exposes the leak.
        >>
        >> I am a bit desperate. Any hint?
        >
        > Not really. Try using TrackRefs.
        >
        > Cheers,
        > mwh


        • Francesc Alted

          #5
          Re: Hunting a memory leak

          [Ooops. Something went wrong with my newsreader config ;-)]

          Thanks for the responses! I started by fetching the TrackRefs() class
          from http://cvs.zope.org/Zope3/test.py and pasted it into my local copy
          of unittest.py. Then I modified the try: block of TestCase.__call__
          from the original:


          try:
              testMethod()
              ok = 1

          to read:

          try:
              rc1 = rc2 = None
              # Pre-heating
              for i in xrange(10):
                  testMethod()
              gc.collect()
              rc1 = sys.gettotalrefcount()
              track = TrackRefs()
              # Second (first "valid") loop
              for i in xrange(10):
                  testMethod()
              gc.collect()
              rc2 = sys.gettotalrefcount()
              print "First output of TrackRefs:"
              track.update()
              print >>sys.stderr, "%5d %s.%s.%s()" % (rc2-rc1,
                  testMethod.__module__, testMethod.im_class.__name__,
                  testMethod.im_func.__name__)
              # Third loop
              for i in xrange(10):
                  testMethod()
              gc.collect()
              rc3 = sys.gettotalrefcount()
              print "Second output of TrackRefs:"
              track.update()
              print >>sys.stderr, "%5d %s.%s.%s()" % (rc3-rc2,
                  testMethod.__module__, testMethod.im_class.__name__,
                  testMethod.im_func.__name__)
              ok = 1

          However, I'm not sure my implementation is right. My
          understanding is that the first loop is for pre-heating (to avoid
          spurious reference counts due to caches and the like). The second loop should
          already give reliable reference counts, which is why I call
          track.update() after it. Finally, I wanted to re-check the results of the
          second loop with a third one, so I expected more or less the
          same results from the second and third loops.

          But... the results are different! Here are the results of this run:

          $ python2.3 widetree3.py
          First output of TrackRefs:
          <type 'str'> 13032 85335
          <type 'tuple'> 8969 38402
          <type 'Cfunc'> 1761 11931
          <type 'code'> 1215 4871
          <type 'function'> 1180 5189
          <type 'dict'> 841 4897
          <type 'builtin_function_or_method'> 516 2781
          <type 'int'> 331 3597
          <type 'wrapper_descriptor'> 295 1180
          <type 'method_descriptor'> 236 944
          <type 'classobj'> 145 1092
          <type 'module'> 107 734
          <type 'list'> 94 440
          <type 'type'> 86 1967
          <type 'getset_descriptor'> 84 336
          <type 'weakref'> 75 306
          <type 'float'> 73 312
          <type 'member_descriptor'> 70 280
          <type 'ufunc'> 52 364
          <type 'instance'> 42 435
          <type 'instancemethod'> 41 164
          <class 'numarray.ufunc._BinaryUFunc'> 25 187
          <class 'numarray.ufunc._UnaryUFunc'> 24 173
          <type 'frame'> 9 44
          <type 'long'> 7 28
          <type 'property'> 6 25
          <type 'PyCObject'> 4 20
          <class 'unittest.TestSuite'> 3 31
          <type 'file'> 3 23
          <type 'listiterator'> 3 12
          <type 'bool'> 2 41
          <class 'random.Random'> 2 30
          <type '_sre.SRE_Pattern'> 2 9
          <type 'complex'> 2 8
          <type 'thread.lock'> 2 8
          <type 'NoneType'> 1 2371
          <class 'unittest._TextTestResult'> 1 16
          <type 'ellipsis'> 1 12
          <class '__main__.WideTreeTestCase'> 1 11
          <class 'tables.IsDescription.metaIsDescription'> 1 10
          <class 'unittest.TestProgram'> 1 9
          <class 'numarray.ufunc._ChooseUFunc'> 1 8
          <class 'unittest.TestLoader'> 1 7
          <class 'unittest.TrackRefs'> 1 6
          <class 'unittest.TextTestRunner'> 1 6
          <type 'NotImplementedType'> 1 6
          <class 'numarray.ufunc._PutUFunc'> 1 5
          <class 'numarray.ufunc._TakeUFunc'> 1 5
          <class 'unittest._WritelnDecorator'> 1 5
          <type 'staticmethod'> 1 4
          <type 'classmethod'> 1 4
          <type 'classmethod_descriptor'> 1 4
          <type 'unicode'> 1 4
          7 __main__.WideTreeTestCase.test00_Leafs()
          Second output of TrackRefs:
          <type 'int'> 37 218
          <type 'type'> 0 74
          212 __main__.WideTreeTestCase.test00_Leafs()
          ..
          ----------------------------------------------------------------------
          Ran 1 test in 0.689s

          OK
          [21397 refs]
          $

          As you can see, for the second loop (first output of TrackRefs), a lot
          of objects appear, but after the third loop (second output of
          TrackRefs), far fewer appear (only objects of type "int" and
          "type"). Besides, the increase in total references for the second
          loop is only 7, while for the third loop it is 212. Finally, to add even
          more confusion, these numbers are *totally* independent of the number
          of iterations I put in the loops. You see 10 in the code, but you can
          try with 100 (in one or all the loops) and you get exactly the same
          figures.

          I definitely think that I have implemented the try:
          code block badly, but I can't figure out what's going wrong.

          I would appreciate some ideas.

          Francesc Alted


          • Edward K. Ream

            #6
            Re: Hunting a memory leak

            > I would appreciate some ideas.

            I doubt many people will be willing to rummage through your app's code to do
            your debugging for you. Here are two general ideas:

            1. Try to simplify the problem. Pick something, no matter how small (and
            the smaller the better) that doesn't seem to be correct and do what it takes
            to find out why it isn't correct. If TrackRefs is Python code you can hack
            that code to give you more (or less!) info. Once you discover the answer to
            one mystery, the larger mysteries may become clearer. For example, you can
            concentrate on one particular data structure, one particular data type or
            one iteration of your test suite.

            2. Try to enjoy the problem. The late great Earl Nightingale had roughly
            this advice: Don't worry. Simply consider the problem calmly, and have
            confidence that the solution will eventually come to you, probably when you
            are least expecting it. I have found that this advice really works, and
            it works for almost any problem. Finding "worthy" bugs is a creative
            process, and creativity can be and should be highly enjoyable.

            In this case, your problem is: "how to start finding my memory leaks".
            Possible answers to this problem might be various strategies for getting
            more (or more focused!) information. Then you have new problems: how to
            implement the various strategies. In all cases, the advice to be calm and
            patient applies. Solving this problem will be highly valuable to you, no
            matter how long it takes :-)

            Edward

            P.S. And don't hesitate to ask more questions, especially once you have more
            concrete data or mysteries.

            EKR
            --------------------------------------------------------------------
            Edward K. Ream email: edreamleo@charter.net
            Leo: Literate Editor with Outlines
            Leo: http://webpages.charter.net/edreamleo/front.html
            --------------------------------------------------------------------



            • Francesc Alted

              #7
              Re: Hunting a memory leak

              On 2003-08-30, Edward K. Ream <edreamleo@charter.net> wrote:
              >> I would appreciate some ideas.
              >
              > I doubt many people will be willing to rummage through your app's code to do
              > your debugging for you. Here are two general ideas:

              Thanks for the words of encouragement. After the weekend I'll be fresher and
              will try to follow your suggestions (and those of Earl Nightingale ;-).

              Cheers,

              Francesc Alted


              • Michael Hudson

                #8
                Re: Hunting a memory leak

                Francesc Alted <falted@openlc.org> writes:

                > As you can see, for the second loop (first output of TrackRefs), a lot
                > of objects appear, but after the third loop (second output of
                > TrackRefs), much less appear (only objects of type "int" and
                > "type"). Besides, the increment of the total references for the second
                > loop is only 7 while for the third loop is 212. Finally, to add even
                > more confusion, these numbers are *totally* independent of the number
                > of iterations I put in the loops. You see 10 in the code, but you can
                > try with 100 (in one or all the loops) and you get exactly the same
                > figures.
                >
                > I definitely think that I have made a bad implementation of the try:
                > code block, but I can't figure out what's going wrong.
                >
                > I would appreciate some ideas.

                In my experience of hunting these, you want to call gc.collect() and
                track.update() *inside* the loops. Other functions you might want to
                call are things like sre.purge(), _strptime.clear_cache(),
                linecache.clearcache()... there's a seemingly unbounded number of
                caches around that can interfere.

                Cheers,
                mwh

                --
                A difference which makes no difference is no difference at all.
                -- William James (I think. Reference anyone?)


                • Will Ware

                  #9
                  Re: Hunting a memory leak

                  Debian User wrote:
                  > I'm trying to discover a memory leak on a program of mine...

                  Several years ago, I came up with a memory leak detector that I used for
                  C extensions with Python 1.5.2. This was before there were gc.* methods
                  available, and I'm guessing they probably do roughly the same things.
                  Still, in the unlikely event it's helpful:


                  Now that I think of it, this might be helpful after all. With this
                  approach, you're checking the total refcount at various points in the
                  loop in your C code, rather than only in the Python code. Take a look
                  anyway.

                  Good luck
                  Will Ware


                  • Francesc Alted

                    #10
                    Re: Hunting a memory leak

                    Edward K. Ream wrote:

                    > Here are two general ideas:
                    >
                    > 1. Try to simplify the problem. Pick something, no matter how small (and
                    > the smaller the better) that doesn't seem to be correct and do what it
                    > takes to find out why it isn't correct.

                    Yeah... using this approach I was finally able to hunt down the leak!

                    The problem was hidden in C code that is used to access a C library. I'm
                    afraid that valgrind was unable to detect it because the underlying C
                    library does not call the standard malloc to create the leaking objects.

                    Of course, the Python reference counters were unable to detect it either
                    (a few unexplained points still remain, but nothing very important).
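
When neither valgrind nor the reference counts can see a leak, the process size is still an honest witness, which is what top was showing all along. A minimal sketch, assuming Python 3 on a Unix-like system where the standard `resource` module is available; `peak_rss` and `peak_rss_growth` are hypothetical helpers. `ru_maxrss` is a high-water mark (reported in kilobytes on Linux and in bytes on macOS), so it can show growth but never shrinkage.

```python
import resource


def peak_rss():
    """Peak resident set size of this process, in platform units."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def peak_rss_growth(func, iterations=100):
    """Call func repeatedly and return how far the peak RSS moved."""
    before = peak_rss()
    for _ in range(iterations):
        func()
    return peak_rss() - before
```

If the reported growth keeps rising as `iterations` is raised, something is holding on to memory, whichever allocator it came from.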

                    Anyway, thanks very much for the advice and encouragement!

                    Francesc Alted


                    • Edward K. Ream

                      #11
                      Re: Hunting a memory leak

                      > Yeah... using this approach I was finally able to hunt the leak!!!.
                      ....
                      > Anyway, thanks very much for the advices and encouragement!

                      You are welcome. IMO, if you can track down memory problems in C you can
                      debug just about anything, with the notable exception of numeric programs.
                      Debugging numeric calculations is hard, and will always remain so.

                      Edward
                      --------------------------------------------------------------------
                      Edward K. Ream email: edreamleo@charter.net
                      Leo: Literate Editor with Outlines
                      Leo: http://webpages.charter.net/edreamleo/front.html
                      --------------------------------------------------------------------

