Hunting a memory leak

**Michael Hudson** · Jul 18 '05, 02:07 AM

Re: Hunting a memory leak

Debian User <falted@inspiron.openlc.org> writes:

> I'm trying to discover a memory leak on a program of mine. I've taken
> several approaches, but the leak still resists to appear.
>
> First of all, I've tried to use the garbage collector to look for
> uncollectable objects.

[snip]

> Then I've taken an approach that I've seen in python developers list
> contributed by Walter Dörwald, that basically consists in creating a
> debug version of python, create a unittest with the leaking code, and
> modify the unittest.py to extract the increment of total reference
> counting in that code (see
> http://aspn.activestate.com/ASPN/Mai...n-dev/1770868).

Well, somewhere in that same thread are various references to a
TrackRefs class. Have you tried using that? It should tell you what
type of object is leaking, which is a good start.
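
For readers without the Zope test runner at hand, the idea behind a TrackRefs-style helper can be sketched roughly as follows. This is not the Zope class: `TypeCountTracker`, `Leaky` and `_hoard` are hypothetical names, the code targets current Python 3, and it only sees objects tracked by the garbage collector (containers and instances, not plain ints or strings).

```python
import gc
from collections import Counter


class TypeCountTracker:
    """Hypothetical stand-in for TrackRefs: reports per-type object count changes."""

    def __init__(self):
        self._last = self._snapshot()

    @staticmethod
    def _snapshot():
        gc.collect()  # drop collectable garbage so only live objects are counted
        return Counter(type(o).__name__ for o in gc.get_objects())

    def update(self):
        """Return {type name: change in count} since the previous call."""
        current = self._snapshot()
        delta = {name: current[name] - self._last[name]
                 for name in set(current) | set(self._last)
                 if current[name] != self._last[name]}
        self._last = current
        return delta


class Leaky:
    """Toy class whose instances are deliberately kept alive."""


_hoard = []
tracker = TypeCountTracker()
_hoard.extend(Leaky() for _ in range(5))  # stands in for the leaking code
growth = tracker.update()
print(growth.get("Leaky"))
```

A type whose count keeps growing from one `update()` call to the next is a good candidate for the leak.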

> With that, I see that my reference count grows by one each time the
> test execute. But the problem is: is there some way to look at the
> object (or make a memory dump) that is leaking?.

See above :-)

> I've used valgrind (http://developer.kde.org/~sewardj/) to see if it
> could detect the leak. In fact, it detects a bunch of them, but I am
> afraid that they are not related with the leak I'm looking for. I am
> saying that because, when I loop over my leaky code, valgrind always
> report the same amount of leaky memory, independently of the number of
> iterations (while top is telling me that memory use is growing!).

There are various things (interned strings, f'ex) that always tend to
be alive at the end of a Python program: these are only leaks in a
very warped sense.

I don't know if there's a way to get valgrind to tell you what's
allocated but not deallocated between two arbitrary points of program
execution.
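
On the Python side, a comparison between two points is possible with the standard library's `tracemalloc` module, which was added in Python 3.4 and so postdates this thread. The sketch below is a different technique from valgrind: it only sees memory requested through Python's own allocators, so a C library that allocates by other means stays invisible to it. The `retained` list is a made-up stand-in for the leaky code.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for the code under suspicion: about 1 MB stays referenced.
retained = [bytearray(1024) for _ in range(1000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Each entry describes one source line; size_diff is the growth in bytes.
stats = after.compare_to(before, "lineno")
total_growth = sum(stat.size_diff for stat in stats)

for stat in stats[:3]:
    print(stat)
```

The first entries printed point at the source lines responsible for most of the growth between the two snapshots.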

> My code uses extension modules in C, so I am afraid this does not
> contribute to alleviate the problem.

Well, in all likelihood, the bug is IN the C extension module. Have
you tried stepping through the code in a debugger? Sometimes that's
a good way of spotting a logic error.

> I am sorry, but I cannot be more explicit about the code because it
> is quite complex (it is the PyTables package, http://pytables.sf.net),
> and I was unable to make a simple example to be published
> here. However, if anyone is tempted to have a look at the code, you
> can download it from
> (http://sourceforge.net/project/showf...roup_id=63486). I am
> attaching a unittest that exposes the leak.
>
> I am a bit desperate. Any hint?

Not really. Try using TrackRefs.

Cheers,
mwh

--
I'm about to search Google for contract assassins to go to Iomega
and HP's programming groups and kill everyone there with some kind
of electrically charged rusty barbed thing.
-- http://bofhcam.org/journal/journal.html, 2002-01-08

**Edward K. Ream** · Jul 18 '05, 02:07 AM

Re: Hunting a memory leak

> I'm trying to discover a memory leak on a program of mine. I've taken
> several approaches, but the leak still resists to appear.

First, single-stepping through C code is surprisingly effective. I heartily
recommend it.

Here are some ideas you might use if you are truly desperate. You will have
to do some work to make them useful in your situation.

1. Keep track of all newly-created objects. Warning: the id trick used in
this code is not proper because newly allocated objects can have the same
address as old objects, so you should devise a better way by creating a more
unique hash. Or just use the code as is and see whether the "invalid" code
tells you something ;-)

    # Fragment: runs inside a function, with lastObjectsDict a module-level dict.
    global lastObjectsDict
    objects = gc.get_objects()

    newObjects = [o for o in objects if not lastObjectsDict.has_key(id(o))]

    lastObjectsDict = {}
    for o in objects:
        lastObjectsDict[id(o)] = o

2. Keep track of the number of objects.

    def printGc(message=None, onlyPrintChanges=False):

        if not debugGC: return None

        if not message:
            message = callerName(n=2) # Left as an exercise for the reader.

        global lastObjectCount

        try:
            n = len(gc.garbage)
            n2 = len(gc.get_objects())
            delta = n2 - lastObjectCount
            if not onlyPrintChanges or delta:
                if n:
                    print "garbage: %d, objects: %+6d =%7d %s" % (n, delta, n2, message)
                else:
                    print "objects: %+6d =%7d %s" % (n2 - lastObjectCount, n2, message)

            lastObjectCount = n2
            return delta
        except:
            traceback.print_exc()
            return None

3. Print lots and lots of info...

    def printGcRefs(verbose=True):

        refs = gc.get_referrers(app().windowList[0])
        print '-' * 30

        if verbose:
            print "refs of", app().windowList[0]
            for ref in refs:
                print type(ref)
                if 0: # very verbose
                    if type(ref) == type({}):
                        keys = ref.keys()
                        keys.sort()
                        for key in keys:
                            val = ref[key]
                            if isinstance(val, leoFrame.LeoFrame): # change as needed
                                print key, ref[key]
        else:
            print "%d referrers" % len(refs)

Here app().windowList is a key data structure of my app. Substitute your
own as a new argument.

Basically, Python will give you all the information you need. The problem
is that there is way too much info, so you must experiment with filtering
it. Don't panic: you can do it.

4. A totally different approach. Consider this function:

    def clearAllIvars(o):

        """Clear all ivars of o, a member of some class."""

        o.__dict__.clear()

This function will grind concrete walls into grains of sand. The GC will
then recover each grain separately.

My app contains several classes that refer to each other. Rather than
tracking all the interlocking references, when it comes time to delete the
main data structure my app simply calls clearAllIvars for the various
classes. Naturally, some care is needed to ensure that calls are made in
the proper order.
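
Ideas 3 and 4 can be tried together on a toy example. The sketch below is written for current Python 3 and is only an illustration: `Frame`, `Tree`, `clear_all_ivars` and `demo` are hypothetical names, not code from Leo. It first asks `gc.get_referrers()` who holds an object, then clears the owner's instance attributes and uses a weak reference to confirm that the object is released.

```python
import gc
import weakref


class Frame:
    """Toy stand-in for an application window (hypothetical)."""


class Tree:
    """Toy object that points back at its frame, forming a cycle."""


def clear_all_ivars(obj):
    """Drop every instance attribute of obj."""
    obj.__dict__.clear()


def demo():
    frame, tree = Frame(), Tree()
    frame.tree, tree.frame = tree, frame      # frame <-> tree cycle

    # Depending on the CPython version, the referrer that owns the
    # attribute is either frame itself or frame's __dict__.
    referrers = gc.get_referrers(tree)
    frame_dict = vars(frame)
    owner_found = any(r is frame or r is frame_dict for r in referrers)

    probe = weakref.ref(tree)
    del tree, referrers                       # only frame.tree holds the Tree now
    alive_before = probe() is not None

    clear_all_ivars(frame)                    # frame lets go of the Tree
    alive_after = probe() is not None
    return owner_found, alive_before, alive_after


print(demo())
```

Once the attributes are cleared, plain reference counting frees the `Tree` immediately, without waiting for the cycle collector.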

HTH.

Edward
--------------------------------------------------------------------
Edward K. Ream email: edreamleo@charter.net
Leo: Literate Editor with Outlines
Leo: http://webpages.charter.net/edreamleo/front.html
--------------------------------------------------------------------

**Francesc Alted** · Jul 18 '05, 02:08 AM

Re: Hunting a memory leak

On 2003-08-29, Michael Hudson <mwh@python.net> wrote:
> Debian User <falted@inspiron.openlc.org> writes:
>
>> I'm trying to discover a memory leak on a program of mine. I've taken
>> several approaches, but the leak still resists to appear.
>>
>> First of all, I've tried to use the garbage collector to look for
>> uncollectable objects.
>
> [snip]
>
>> Then I've taken an approach that I've seen in python developers list
>> contributed by Walter Dörwald, that basically consists in creating a
>> debug version of python, create a unittest with the leaking code, and
>> modify the unittest.py to extract the increment of total reference
>> counting in that code (see
>> http://aspn.activestate.com/ASPN/Mai...n-dev/1770868).
>
> Well, somewhere in that same thread are various references to a
> TrackRefs class. Have you tried using that? It should tell you what
> type of object is leaking, which is a good start.
>
>> With that, I see that my reference count grows by one each time the
>> test execute. But the problem is: is there some way to look at the
>> object (or make a memory dump) that is leaking?.
>
> See above :-)
>
>> I've used valgrind (http://developer.kde.org/~sewardj/) to see if it
>> could detect the leak. In fact, it detects a bunch of them, but I am
>> afraid that they are not related with the leak I'm looking for. I am
>> saying that because, when I loop over my leaky code, valgrind always
>> report the same amount of leaky memory, independently of the number of
>> iterations (while top is telling me that memory use is growing!).
>
> There are various things (interned strings, f'ex) that always tend to
> be alive at the end of a Python program: these are only leaks in a
> very warped sense.
>
> I don't know if there's a way to get valgrind to tell you what's
> allocated but not deallocated between two arbitrary points of program
> execution.
>
>> My code uses extension modules in C, so I am afraid this does not
>> contribute to alleviate the problem.
>
> Well, in all likelihood, the bug is IN the C extension module. Have
> you tried stepping through the code in a debugger? Sometimes that's
> a good way of spotting a logic error.
>
>> I am sorry, but I cannot be more explicit about the code because it
>> is quite complex (it is the PyTables package, http://pytables.sf.net),
>> and I was unable to make a simple example to be published
>> here. However, if anyone is tempted to have a look at the code, you
>> can download it from
>> (http://sourceforge.net/project/showf...roup_id=63486). I am
>> attaching a unittest that exposes the leak.
>>
>> I am a bit desperate. Any hint?
>
> Not really. Try using TrackRefs.
>
> Cheers,
> mwh

**Francesc Alted** · Jul 18 '05, 02:08 AM

Re: Hunting a memory leak

[Ooops. Something went wrong with my newsreader config ;-)]

Thanks for the responses! I started by fetching the TrackRefs() class
from http://cvs.zope.org/Zope3/test.py and pasted it into my local copy
of unittest.py. Then I modified the try: block of TestCase.__call__
from the original:

    try:
        testMethod()
        ok = 1

to read:

    try:
        rc1 = rc2 = None
        # Pre-heating
        for i in xrange(10):
            testMethod()
        gc.collect()
        rc1 = sys.gettotalrefcount()
        track = TrackRefs()
        # Second (first "valid") loop
        for i in xrange(10):
            testMethod()
        gc.collect()
        rc2 = sys.gettotalrefcount()
        print "First output of TrackRefs:"
        track.update()
        print >>sys.stderr, "%5d %s.%s.%s()" % (rc2-rc1,
            testMethod.__module__, testMethod.im_class.__name__,
            testMethod.im_func.__name__)
        # Third loop
        for i in xrange(10):
            testMethod()
        gc.collect()
        rc3 = sys.gettotalrefcount()
        print "Second output of TrackRefs:"
        track.update()
        print >>sys.stderr, "%5d %s.%s.%s()" % (rc3-rc2,
            testMethod.__module__, testMethod.im_class.__name__,
            testMethod.im_func.__name__)
        ok = 1

However, I'm not sure my implementation is right. My understanding is
that the first loop is for pre-heating (to avoid false reference counts
due to caches and the like). The second loop should already give
reliable reference counts, which is why I call track.update() after
it. Finally, I wanted to re-check the results of the second loop with
a third one, so I expected more or less the same results from the
second and third loops.

But... the results are different! Here are the results of this run:

    $ python2.3 widetree3.py
    First output of TrackRefs:
    <type 'str'> 13032 85335
    <type 'tuple'> 8969 38402
    <type 'Cfunc'> 1761 11931
    <type 'code'> 1215 4871
    <type 'function'> 1180 5189
    <type 'dict'> 841 4897
    <type 'builtin_function_or_method'> 516 2781
    <type 'int'> 331 3597
    <type 'wrapper_descriptor'> 295 1180
    <type 'method_descriptor'> 236 944
    <type 'classobj'> 145 1092
    <type 'module'> 107 734
    <type 'list'> 94 440
    <type 'type'> 86 1967
    <type 'getset_descriptor'> 84 336
    <type 'weakref'> 75 306
    <type 'float'> 73 312
    <type 'member_descriptor'> 70 280
    <type 'ufunc'> 52 364
    <type 'instance'> 42 435
    <type 'instancemethod'> 41 164
    <class 'numarray.ufunc._BinaryUFunc'> 25 187
    <class 'numarray.ufunc._UnaryUFunc'> 24 173
    <type 'frame'> 9 44
    <type 'long'> 7 28
    <type 'property'> 6 25
    <type 'PyCObject'> 4 20
    <class 'unittest.TestSuite'> 3 31
    <type 'file'> 3 23
    <type 'listiterator'> 3 12
    <type 'bool'> 2 41
    <class 'random.Random'> 2 30
    <type '_sre.SRE_Pattern'> 2 9
    <type 'complex'> 2 8
    <type 'thread.lock'> 2 8
    <type 'NoneType'> 1 2371
    <class 'unittest._TextTestResult'> 1 16
    <type 'ellipsis'> 1 12
    <class '__main__.WideTreeTestCase'> 1 11
    <class 'tables.IsDescription.metaIsDescription'> 1 10
    <class 'unittest.TestProgram'> 1 9
    <class 'numarray.ufunc._ChooseUFunc'> 1 8
    <class 'unittest.TestLoader'> 1 7
    <class 'unittest.TrackRefs'> 1 6
    <class 'unittest.TextTestRunner'> 1 6
    <type 'NotImplementedType'> 1 6
    <class 'numarray.ufunc._PutUFunc'> 1 5
    <class 'numarray.ufunc._TakeUFunc'> 1 5
    <class 'unittest._WritelnDecorator'> 1 5
    <type 'staticmethod'> 1 4
    <type 'classmethod'> 1 4
    <type 'classmethod_descriptor'> 1 4
    <type 'unicode'> 1 4
        7 __main__.WideTreeTestCase.test00_Leafs()
    Second output of TrackRefs:
    <type 'int'> 37 218
    <type 'type'> 0 74
      212 __main__.WideTreeTestCase.test00_Leafs()
    ..
    ----------------------------------------------------------------------
    Ran 1 test in 0.689s

    OK
    [21397 refs]
    $

As you can see, after the second loop (first output of TrackRefs) a lot
of objects appear, but after the third loop (second output of
TrackRefs) far fewer appear (only objects of type "int" and
"type"). Besides, the increment in total references for the second
loop is only 7, while for the third loop it is 212. Finally, to add even
more confusion, these numbers are *totally* independent of the number
of iterations I put in the loops. You see 10 in the code, but you can
try with 100 (in one or all the loops) and you get exactly the same
figures.

I definitely think that I have made a bad implementation of the try:
code block, but I can't figure out what's going wrong.

I would appreciate some ideas.

Francesc Alted

**Edward K. Ream** · Jul 18 '05, 02:09 AM

Re: Hunting a memory leak

> I would appreciate some ideas.

I doubt many people will be willing to rummage through your app's code to do
your debugging for you. Here are two general ideas:

1. Try to simplify the problem. Pick something, no matter how small (and
the smaller the better) that doesn't seem to be correct and do what it takes
to find out why it isn't correct. If TrackRefs is Python code you can hack
that code to give you more (or less!) info. Once you discover the answer to
one mystery, the larger mysteries may become clearer. For example, you can
concentrate on one particular data structure, one particular data type or
one iteration of your test suite.

2. Try to enjoy the problem. The late great Earl Nightingale had roughly
this advice: Don't worry. Simply consider the problem calmly, and have
confidence that the solution will eventually come to you, probably when you
are least expecting it. I have found that this advice really works, and
it works for almost any problem. Finding "worthy" bugs is a creative
process, and creativity can be and should be highly enjoyable.

In this case, your problem is: "how to start finding my memory leaks".
Possible answers to this problem might be various strategies for getting
more (or more focused!) information. Then you have new problems: how to
implement the various strategies. In all cases, the advice to be calm and
patient applies. Solving this problem will be highly valuable to you, no
matter how long it takes :-)

Edward

P.S. And don't hesitate to ask more questions, especially once you have more
concrete data or mysteries.

EKR
--------------------------------------------------------------------
Edward K. Ream email: edreamleo@charter.net
Leo: Literate Editor with Outlines
Leo: http://webpages.charter.net/edreamleo/front.html
--------------------------------------------------------------------

**Francesc Alted** · Jul 18 '05, 02:10 AM

Re: Hunting a memory leak

On 2003-08-30, Edward K. Ream <edreamleo@charter.net> wrote:
>> I would appreciate some ideas.
>
> I doubt many people will be willing to rummage through your app's code to do
> your debugging for you. Here are two general ideas:

Thanks for the words of encouragement. After the weekend I'll be fresher and
will try to follow your suggestions (and those of Earl Nightingale ;-).

Cheers,

Francesc Alted

**Michael Hudson** · Jul 18 '05, 02:11 AM

Re: Hunting a memory leak

Francesc Alted <falted@openlc.org> writes:

> As you can see, after the second loop (first output of TrackRefs) a lot
> of objects appear, but after the third loop (second output of
> TrackRefs) far fewer appear (only objects of type "int" and
> "type"). Besides, the increment in total references for the second
> loop is only 7, while for the third loop it is 212. Finally, to add even
> more confusion, these numbers are *totally* independent of the number
> of iterations I put in the loops. You see 10 in the code, but you can
> try with 100 (in one or all the loops) and you get exactly the same
> figures.
>
> I definitely think that I have made a bad implementation of the try:
> code block, but I can't figure out what's going wrong.
>
> I would appreciate some ideas.

In my experience of hunting these you want to call gc.collect() and
track.update() *inside* the loops. Other functions you might want to
call are things like sre.purge(), _strptime.clear_cache(),
linecache.clearcache()... there's a seemingly unbounded number of
caches around that can interfere.

Cheers,
mwh

--
A difference which makes no difference is no difference at all.
-- William James (I think. Reference anyone?)

**Will Ware** · Jul 18 '05, 02:11 AM

Re: Hunting a memory leak

Debian User wrote:
> I'm trying to discover a memory leak on a program of mine...

Several years ago, I came up with a memory leak detector that I used for
C extensions with Python 1.5.2. This was before there were gc.* methods
available, and I'm guessing they probably do roughly the same things.
Still, in the unlikely event it's helpful:

http://www.faqts.com/knowledge_base/view.phtml/aid/6006

Now that I think of it, this might be helpful after all. With this
approach, you're checking the total refcount at various points in the
loop in your C code, rather than only in the Python code. Take a look
anyway.

Good luck
Will Ware

**Francesc Alted** · Jul 18 '05, 02:13 AM

Re: Hunting a memory leak

Edward K. Ream wrote:
[color=blue]
> Here are two general ideas:
>
> 1. Try to simplify the problem. Pick something, no matter how small (and
> the smaller the better) that doesn't seem to be correct and do what it
> takes to find out why it isn't correct.[/color]

Yeah... using this approach I was finally able to hunt down the leak!

The problem was hidden in C code that is used to access a C library. I'm
afraid valgrind was unable to detect it because the underlying C
library does not call the standard malloc to create the leaking objects.

Of course, the Python reference counters were unable to detect it either
(some obscure points still remain, but they are not very important).

Anyway, thanks very much for the advice and encouragement!

Francesc Alted

**Edward K. Ream** · Jul 18 '05, 02:13 AM

Re: Hunting a memory leak

> Yeah... using this approach I was finally able to hunt down the leak!
> ...
> Anyway, thanks very much for the advice and encouragement!

You are welcome. IMO, if you can track down memory problems in C you can
debug just about anything, with the notable exception of numeric programs.
Debugging numeric calculations is hard, and will always remain so.

Edward
--------------------------------------------------------------------
Edward K. Ream email: edreamleo@charter.net
Leo: Literate Editor with Outlines
Leo: http://webpages.charter.net/edreamleo/front.html
--------------------------------------------------------------------
