Memory leak/gc.get_objects()/Improved gc in version 2.5

  • crazy420fingers@gmail.com

    Memory leak/gc.get_objects()/Improved gc in version 2.5

    I'm running a python program that simulates a wireless network
    protocol for a certain number of "frames" (measure of time). I've
    observed the following:

    1. The memory consumption of the program grows as the number of frames
    I simulate increases.

    To verify this, I've used two methods, which I invoke after every
    frame simulated:

    -- Parsing the /proc/<pid>/status file as in:

    -- Using ps vg | grep python | awk '!/grep/ {print " ",$8}' in an
    os.system() call.
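
    The /proc example itself appears to have been dropped from the post;
    a minimal sketch of that approach (field names per the Linux procfs
    status format, `parse_status` an illustrative helper, modern Python
    shown) might look like:

    ```python
    import os

    def parse_status(text):
        # Pull the VmSize/VmRSS lines out of /proc/<pid>/status content.
        usage = {}
        for line in text.splitlines():
            if line.startswith(("VmSize", "VmRSS")):
                key, value = line.split(":", 1)
                usage[key] = value.strip()   # e.g. "12345 kB"
        return usage

    # On Linux, apply it to the current process:
    # with open("/proc/%d/status" % os.getpid()) as f:
    #     print(parse_status(f.read()))
    ```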

    The memory usage vs. frame number graph shows some big "jumps" at
    certain points, and, after a large number of frames, shows a steady
    upward slope.

    2. I think I've verified that the objects I instantiate are actually
    freed-- I'm therefore assuming that this "leak" is "caused" by
    python's garbage collection mechanism. I count the number of objects I
    generate that are being tracked by gc as follows:

    gc.collect()
    objCount = {}
    objList = gc.get_objects()
    for obj in objList:
        if getattr(obj, "__class__", None):
            name = obj.__class__.__name__
            if objCount.has_key(name):
                objCount[name] += 1
            else:
                objCount[name] = 1

    for name in objCount:
        print name, ":", objCount[name]

    del objList

    Running this snippet every hundred frames or so shows that the number
    of objects managed by gc is not growing.

    I upgraded to Python 2.5 in an attempt to solve this problem. The
    only change in my observations from version 2.4 is that the absolute
    memory usage level seems to have dropped. However, I still see the
    jumps in memory usage at the same points in time.

    Can anybody explain why the memory usage shows significant jumps (~200
    KB or ~500 KB) over time (i.e. "frames") even though there is no
    apparent increase in the objects managed by gc? Note that I'm calling
    gc.collect() regularly.

    Thanks for your attention,

    Arvind

  • Terry Reedy

    #2
    Re: Memory leak/gc.get_objects()/Improved gc in version 2.5


    <crazy420fingers@gmail.com> wrote in message
    news:1191890643.412806.121180@r29g2000hsg.googlegroups.com...

    Questions like this about memory consumption should start with the
    information printed by the interactive interpreter on startup and
    additional info about whether the binary is from stock CPython or has 3rd
    party modules compiled in. The latter are typically the source of real
    problems.




    • Chris Mellon

      #3
      Re: Memory leak/gc.get_objects()/Improved gc in version 2.5

      On 10/8/07, crazy420fingers@gmail.com <crazy420fingers@gmail.com> wrote:
      > I'm running a python program that simulates a wireless network
      > protocol for a certain number of "frames" (measure of time). I've
      > observed the following:
      >
      > 1. The memory consumption of the program grows as the number of frames
      > I simulate increases.
      >
      > To verify this, I've used two methods, which I invoke after every
      > frame simulated:
      >
      > -- Parsing the /proc/<pid>/status file as in:
      >
      > -- Using ps vg | grep python | awk '!/grep/ {print " ",$8}' in an
      > os.system() call.
      >
      > The memory usage vs. frame number graph shows some big "jumps" at
      > certain points, and, after a large number of frames, shows a steady
      > upward slope
      >

      This would be expected if you're creating ever-larger amounts of
      objects - python uses memory pools and as the number of simultaneous
      objects increases, the size of the pool will need to increase. This
      isn't expected if the total number of objects you create is pretty
      much static, but the way you're trying to determine that is flawed
      (see below).

      > 2. I think I've verified that the objects I instantiate are actually
      > freed-- I'm therefore assuming that this "leak" is "caused" by
      > python's garbage collection mechanism. I count the number of objects I
      > generate that are being tracked by gc as follows:
      >
      > gc.collect()
      > objCount = {}
      > objList = gc.get_objects()
      > for obj in objList:
      >     if getattr(obj, "__class__", None):
      >         name = obj.__class__.__name__
      >         if objCount.has_key(name):
      >             objCount[name] += 1
      >         else:
      >             objCount[name] = 1
      >
      > for name in objCount:
      >     print name, ":", objCount[name]
      >
      > del objList
      >
      > Running this snippet every hundred frames or so, shows that the number
      > of objects managed by gc is not growing.
      >
      > I upgraded to Python 2.5 in an attempt to solve this problem. The
      > only change in my observations from version 2.4 is that the absolute
      > memory usage level seems to have dropped. However, I still see the
      > jumps in memory usage at the same points in time.
      >
      > Can anybody explain why the memory usage shows significant jumps (~200
      > KB or ~500 KB) over time (i.e. "frames") even though there is no
      > apparent increase in the objects managed by gc? Note that I'm calling
      > gc.collect() regularly.
      >
      You're misunderstanding the purpose of Python's GC. Python is
      refcounted. The GC exists only to find and break reference cycles. If
      you don't have ref cycles, the GC doesn't do anything and you could
      just turn it off.
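
      The refcounting-vs-cycles distinction is easy to demonstrate with
      weakref (a sketch in modern Python, not from the thread; the `Node`
      class is illustrative):

      ```python
      import gc
      import weakref

      class Node(object):
          pass

      gc.disable()            # cycle collector off; refcounting still runs
      obj = Node()
      ref = weakref.ref(obj)
      del obj                 # refcount hits zero: freed immediately
      print(ref() is None)    # True -- no collector was needed

      # A reference cycle, however, now leaks until gc runs again:
      a = Node(); b = Node()
      a.other = b; b.other = a
      wa = weakref.ref(a)
      del a, b
      print(wa() is None)     # False -- the cycle keeps both alive
      gc.enable()
      gc.collect()
      print(wa() is None)     # True -- only the cycle collector frees it
      ```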

      gc.get_objects() is a snapshot of the currently existing objects, and
      won't give you any information about peak object count, which is the
      most direct correlation to total memory use.
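
      A small illustration of why a post-frame snapshot misses the peak
      (a sketch; `simulate_frame` is an illustrative stand-in, not the
      poster's simulator):

      ```python
      import gc

      def simulate_frame():
          # A burst of temporary objects that all die before the frame ends:
          scratch = [[i] for i in range(10000)]   # peak: ~10000 extra lists
          return len(scratch)

      before = len(gc.get_objects())
      simulate_frame()
      gc.collect()
      after = len(gc.get_objects())

      # The post-frame snapshot looks essentially flat, even though
      # ~10000 extra lists existed at the peak inside simulate_frame():
      print(abs(after - before))
      ```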


      • arvind

        #4
        Re: Memory leak/gc.get_objects()/Improved gc in version 2.5

        On Oct 9, 7:54 am, "Chris Mellon" <arka...@gmail.com> wrote:
        On 10/8/07, crazy420fing...@gmail.com <crazy420fing...@gmail.com> wrote:
        > [earlier messages quoted in full; snipped]
        Chris,

        Thanks for your reply.

        To answer the earlier question, I used CPython 2.4.3 and ActivePython
        2.5.1 in my analysis above. No custom modules added. Interpreter
        banners are at the end of this message.

        In my program, I do keep instantiating new objects every "frame".
        However, these objects are no longer needed after a few frames, and
        the program no longer maintains a reference to old objects. Therefore,
        I expect the reference-counting mechanism built into python (whatever
        it is, if not gc) to free memory used by these objects and return it
        to the "pool" from which they were allocated. Further, I would expect
        that in time, entire pools would become free, and these free pools
        should be reused for new objects. Therefore the total number of pools
        allocated (and therefore "arenas"?) should not grow over time, if
        pools are being correctly reclaimed. Is this not expected behavior?
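
        On the arena question: CPython 2.5 was the first release whose
        allocator could return completely empty arenas to the OS (earlier
        versions held them for the life of the process), which fits the
        drop in absolute usage seen after upgrading. That freed objects do
        go back to pymalloc's pools for reuse can be shown on a modern
        interpreter with sys.getallocatedblocks() (CPython 3.4+; a sketch,
        not the original program):

        ```python
        import sys

        # sys.getallocatedblocks() counts blocks currently held from the
        # interpreter's small-object allocator (pymalloc).
        base = sys.getallocatedblocks()
        data = [[i] for i in range(50000)]
        peak = sys.getallocatedblocks()
        del data
        freed = sys.getallocatedblocks()

        # The block count drops back near the baseline: the objects were
        # freed into pymalloc's pools. The OS-level footprint need not
        # shrink, though, since pools/arenas are kept around for reuse.
        print(peak - base, freed - base)
        ```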

        Also, since I sample gc.get_objects() frequently, I would expect that
        I would stumble upon a "peak" memory usage snapshot, or at the very
        least see a good bit of variation in the output. However, this does
        not occur.

        Finally, if I deliberately hold on to references to old objects,
        gc.get_objects() clearly shows an increasing number of objects being
        tracked in each snapshot, and the memory leak is well explained.
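
        That growing-snapshot behaviour takes only a few lines to
        reproduce; `leaked` and `type_counts` below are illustrative names,
        not from the original program (modern Python shown):

        ```python
        import gc
        from collections import Counter

        leaked = []   # deliberately retained references

        def type_counts():
            # Tally currently tracked objects by type name, as in the
            # snippet from the first message.
            return Counter(type(o).__name__ for o in gc.get_objects())

        before = type_counts()
        for frame in range(1000):
            leaked.append({"frame": frame})   # one dict kept per "frame"
        after = type_counts()

        # Successive snapshots now show a clearly growing dict count:
        print(after["dict"] - before["dict"])
        ```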

        Python version info:

        ActivePython 2.5.1.1 (ActiveState Software Inc.) based on
        Python 2.5.1 (r251:54863, May 2 2007, 08:46:07)
        [GCC 3.3.4 (pre 3.3.5 20040809)] on linux2

        AND

        Python 2.4.3 (#1, Mar 14 2007, 19:01:42)
        [GCC 4.1.1 20070105 (Red Hat 4.1.1-52)] on linux2


        Arvind
