Memory leak in Python

  • diffuser78@gmail.com

    Memory leak in Python

    I have Python code that runs on a huge data set. After
    starting the program, the computer becomes unstable and it gets very
    difficult to even open Konsole to kill the process. I am assuming
    that I am running out of memory.

    What should I do to make sure that my code runs fine without becoming
    unstable? How should I address the memory leak problem, if any? I have
    a gig of RAM.

    Any help is appreciated.

  • compromise@gmail.com

    #2
    Re: Memory leak in Python

    Can you paste an example of the code you're using?

    • vbgunz

      #3
      Re: Memory leak in Python

      How big is the set? 100MB, more? What are you doing with the set? Do
      you have a small example that can prove the set is causing the freeze?
      I am not the sharpest tool in the shed, but it sounds like you might be
      multiplying your set, directly or indirectly, either permanently or
      temporarily, on purpose or by accident.

      • diffuser78@gmail.com

        #4
        Re: Memory leak in Python

        It's kinda 650 lines of code... not the best idea to paste the code.
        compromise@gmail.com wrote:
        > Can you paste an example of the code you're using?

        • Sybren Stuvel

          #5
          Re: Memory leak in Python

          diffuser78@gmail.com enlightened us with:
          > I have a python code which is running on a huge data set. After
          > starting the program the computer becomes unstable and gets very
          > difficult to even open konsole to kill that process. What I am
          > assuming is that I am running out of memory.

          Before acting on your assumptions, you need to verify them. Run 'top'
          and hit 'M' to sort by memory usage. After that, use 'ulimit' to limit
          the allowed memory usage, run your program again, and see if it stops
          at some point due to memory problems.
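
A rough Python equivalent of the 'ulimit' check, as a sketch only: it assumes Linux (where RLIMIT_AS is enforced), and the helper name `run_with_memory_limit` and the 1 GiB cap are made up for illustration. The child process is capped, so a runaway allocation fails with MemoryError instead of dragging the machine into swap.

```python
import resource
import subprocess
import sys

def run_with_memory_limit(code, limit_bytes):
    """Run `code` in a child Python process with a capped address space."""
    def apply_limit():
        # Runs in the child just before exec, similar in spirit to `ulimit -v`.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = limit_bytes if hard == resource.RLIM_INFINITY else min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    result = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limit,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode

LIMIT = 1024 ** 3  # 1 GiB cap, an arbitrary value for this demo

# A small allocation fits under the cap; a 4 GiB one should not.
small = run_with_memory_limit("data = bytearray(10 * 1024 * 1024)", LIMIT)
large = run_with_memory_limit("data = bytearray(4 * 1024 ** 3)", LIMIT)
```

A zero return code means the child finished normally; a non-zero one here means it died on the failed allocation, which is the signal that memory is the real problem.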

          Sybren
          --
          The problem with the world is stupidity. Not saying there should be a
          capital punishment for stupidity, but why don't we just take the
          safety labels off of everything and let the problem solve itself?
          Frank Zappa

          • Peter Tillotson

            #6
            Re: Memory leak in Python

            1) Review your design - You say you are processing a large data set,
            just make sure you are not trying to store 3 versions. If you are
            missing a design, create a flow chart or something that is true to the
            code you have produced. You could probably even post the design if you
            are brave enough.

            2) Check your implementation - make sure you manage lists, arrays etc.
            correctly. You need to sever links (references) to objects for them to
            get swept up. I know it is obvious, but it is easy to get wrong in a
            hasty implementation.

            3) Verify and test the problem's characteristics with profilers, top,
            etc. It is hard for us to help you much without more info. Test your
            assumptions.
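
To illustrate point 2, here is a minimal sketch of what "severing references" means in CPython. The `Node` class and the `history_log` list are hypothetical stand-ins, not taken from the poster's program. A weak reference is used only as a probe to see whether the object is still alive.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for one simulated node."""
    def __init__(self, node_id):
        self.node_id = node_id

nodes = [Node(i) for i in range(3)]
probe = weakref.ref(nodes[0])    # a weak reference does not keep the node alive

history_log = [nodes[0]]         # a stray reference that is easy to forget about

nodes.clear()                    # the main container lets go of every node...
gc.collect()
still_alive = probe() is not None   # ...but history_log still holds node 0

history_log.clear()              # sever the last strong reference
gc.collect()
collected = probe() is None      # now the node can be reclaimed
```

The point is that clearing the obvious container is not enough if some secondary list, dict, or attribute still points at the object.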

            Problem solving and debugging is a process, not some mystic art. Though
            sometimes the Gremlins disappear after a pint or two :-)

            p

            diffuser78@gmail.com wrote:
            > I have a python code which is running on a huge data set. After
            > starting the program the computer becomes unstable and gets very
            > difficult to even open konsole to kill that process. What I am assuming
            > is that I am running out of memory.
            >
            > What should I do to make sure that my code runs fine without becoming
            > unstable. How should I address the memory leak problem if any ? I have
            > a gig of RAM.
            >
            > Every help is appreciated.

            • bruno at modulix

              #7
              Re: Memory leak in Python

              diffuser78@gmail.com wrote:
              > I have a python code which is running on a huge data set. After
              > starting the program the computer becomes unstable and gets very
              > difficult to even open konsole to kill that process. What I am assuming
              > is that I am running out of memory.
              >
              > What should I do to make sure that my code runs fine without becoming
              > unstable. How should I address the memory leak problem if any ? I have
              > a gig of RAM.
              >
              > Every help is appreciated.

              Just a hint: if you're trying to load your whole "huge data set" into
              memory, you're in for trouble whatever the language. For example,
              doing a 'buf = openedFile.read()' on a 100 gig file may not be a good
              idea...
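
As a sketch of the alternative to reading a whole file at once: process it in fixed-size chunks so that only one chunk is in memory at a time. The function names and the 64 KiB chunk size are arbitrary choices for illustration.

```python
import os
import tempfile

def iter_chunks(path, chunk_size=64 * 1024):
    """Yield the file's contents piece by piece instead of all at once."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

def total_bytes(path):
    # Only one chunk is alive at any moment, however large the file is.
    return sum(len(chunk) for chunk in iter_chunks(path))

# Small demo on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200_000)
size_seen = total_bytes(tmp.name)
os.remove(tmp.name)
```

For text files, iterating over the file object line by line gives the same benefit with less code.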



              --
              bruno desthuilliers
              python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
              p in 'onurb@xiludom.gro'.split('@')])"

              • diffuser78@gmail.com

                #8
                Re: Memory leak in Python

                I am using Ubuntu Linux.

                My program is a simulation program with four classes, and it mimics
                BitTorrent file sharing systems on 2000 nodes. Each node has a lot of
                attributes and my program kind of tries to keep tabs on everything. As
                I mentioned, it's a simulation program: it starts at time T=0 and goes on
                until all nodes have received all parts of the file (BitTorrent
                concept). The ending time goes to thousands of seconds. In each second I
                process all the 2000 nodes.

                Pseudocode

                Time = 0
                while (True) {
                    For all nodes in the system {
                        Process + computation
                    }
                    Time++
                    If (DownloadFinished == True) exit;
                }


                Dennis Lee Bieber wrote:
                > On 8 May 2006 18:15:02 -0700, diffuser78@gmail.com declaimed the
                > following in comp.lang.python:
                >
                > > I have a python code which is running on a huge data set. After
                > > starting the program the computer becomes unstable and gets very
                > > difficult to even open konsole to kill that process. What I am assuming
                > > is that I am running out of memory.
                > >
                > > What should I do to make sure that my code runs fine without becoming
                > > unstable. How should I address the memory leak problem if any ? I have
                > > a gig of RAM.
                > >
                > Does the memory come back after the process exits?
                >
                > You don't show any sample of code or data... Nor do you mention what
                > OS/processor is involved.
                >
                > Many systems do not return /allocated/ memory to the OS until the
                > top-level process exits, even if the memory is "freed" from the
                > viewpoint of the process.
                > --
                > Wulfraed Dennis Lee Bieber KD6MOG
                > wlfraed@ix.netcom.com wulfraed@bestiaria.com
                > HTTP://wlfraed.home.netcom.com/
                > (Bestiaria Support Staff: web-asst@bestiaria.com)
                > HTTP://www.bestiaria.com/

                • Sybren Stuvel

                  #9
                  Re: Memory leak in Python

                  diffuser78@gmail.com enlightened us with:
                  > My program is a simulation program with four classes and it mimics
                  > bittorrent file sharing systems on 2000 nodes.

                  Wouldn't it be better to use an existing simulator? That way, you
                  won't have to do the stuff you don't want to think about, and focus on
                  the more interesting parts. There are plenty of discrete-event and
                  discrete-time simulators to choose from.

                  Sybren
                  --
                  The problem with the world is stupidity. Not saying there should be a
                  capital punishment for stupidity, but why don't we just take the
                  safety labels off of everything and let the problem solve itself?
                  Frank Zappa

                  • Serge Orlov

                    #10
                    Re: Memory leak in Python

                    diffuser78@gmail.com wrote:
                    > I am using Ubuntu Linux.
                    >
                    > My program is a simulation program with four classes and it mimics bit
                    > torrent file sharing systems on 2000 nodes. Now, each node has lot of
                    > attributes and my program kinds of tries to keep tab of everything. As
                    > I mentioned its a simulation program, it starts at time T=0 and goes on
                    > until all nodes have received all parts of the file(BitTorrent
                    > concept). The ending time goes to thousands of seconds. In each sec I
                    > process all the 2000 nodes.

                    Most likely you keep references to objects you don't need, so Python's
                    garbage collector cannot remove those objects. If you cannot figure it
                    out by looking at the source code, you can gather some statistics to
                    help you. For example, use the gc module to iterate over all objects in
                    your program (gc.get_objects()) and find out which types of objects are
                    growing with each iteration.
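
A minimal sketch of that statistics-gathering idea: count live objects by type before and after a step and look at what grew. The helper names and the `Piece` class are hypothetical; note that gc.get_objects() only lists objects tracked by the collector (containers and class instances), not every integer or string.

```python
import gc
from collections import Counter

def type_counts():
    """Count gc-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())

def growth(before, after):
    # Counter subtraction keeps only the types whose count went up.
    return after - before

class Piece:
    """Hypothetical object created on every simulation step."""
    def __init__(self, index):
        self.index = index

leaked = []                      # stands in for a container that is never pruned

before = type_counts()
for i in range(1000):            # one "time step" that forgets to clean up
    leaked.append(Piece(i))
after = type_counts()

grown = growth(before, after)
```

Calling `grown.most_common(10)` after each time step of the real program should point at the class whose instances pile up.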

                    • bruno at modulix

                      #11
                      Re: Memory leak in Python

                      diffuser78@gmail.com wrote:
                      (top-post corrected)
                      >
                      > bruno at modulix wrote:
                      >
                      >> diffuser78@gmail.com wrote:
                      >>
                      >>> I have a python code which is running on a huge data set. After
                      >>> starting the program the computer becomes unstable and gets very
                      >>> difficult to even open konsole to kill that process. What I am assuming
                      >>> is that I am running out of memory.
                      >>>
                      >>> What should I do to make sure that my code runs fine without becoming
                      >>> unstable. How should I address the memory leak problem if any ? I have
                      >>> a gig of RAM.
                      >>>
                      >>> Every help is appreciated.
                      >>
                      >> Just a hint : if you're trying to load your whole "huge data set" in
                      >> memory, you're in for trouble whatever the language - for an example,
                      >> doing a 'buf = openedFile.read()' on a 100 gig file may not be a good
                      >> idea...
                      >>
                      >
                      > The amount of data I read in is actually small.

                      So the problem is probably elsewhere... Sorry, since you were talking
                      about a huge dataset, the good old "read-whole-file-in-memory" antipattern
                      seemed an obvious guess.

                      > If you see my algorithm above it deals with 2000 nodes and each node
                      > has a lot of attributes.
                      >
                      > When I close the program my computer becomes stable and performs as
                      > usual. I check the performance in Performance monitor and using "top"
                      > and the total memory is being used and on top of that around half a gig
                      > swap memory is also being used.
                      >
                      > Please give some helpful pointers to overcome such memory errors.

                      A real memory leak would cause the memory usage to keep increasing as
                      long as your program is running. If this is not the case, it's not a
                      "memory error", but a design/program error. FWIW, apps like Zope can end
                      up using a whole lot of memory, but there's no known memory-leak problem
                      AFAIK. And believe me, a Zope app can end up managing a *really huge
                      lot* of objects (>= many thousands).
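
One way to check whether usage really keeps growing while the program runs is to sample it once per time step. The sketch below uses the standard tracemalloc module (available in current Python versions, not at the time of this thread); `sample_usage` and the two toy step functions are hypothetical.

```python
import tracemalloc

def sample_usage(step_fn, steps):
    """Run step_fn for `steps` iterations, recording traced memory after each."""
    tracemalloc.start()
    samples = []
    try:
        for t in range(steps):
            step_fn(t)
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
    finally:
        tracemalloc.stop()
    return samples

kept = []

def leaky_step(t):
    kept.append(bytearray(100_000))   # every step keeps its buffer forever

def bounded_step(t):
    scratch = bytearray(100_000)      # buffer is released when the step returns

leaky = sample_usage(leaky_step, 5)
bounded = sample_usage(bounded_step, 5)
```

A series that climbs step after step (like `leaky`) suggests something is being retained; a flat series (like `bounded`) suggests the memory is simply what the design needs.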

                      > I revisited my code to find nothing so obvious which would let this
                      > leak happen. How to kill cross references in the program.

                      Using weakref and/or gc might help.

                      FWIW, the default memory management in Python is based on
                      reference-counting. As long as anything keeps a reference to an object,
                      this object will stay alive. If you have a lot of cross-references and
                      2000+ big objects, you may effectively end up eating all the RAM and
                      more. The gc module can detect and manage some cyclic references (obj A
                      has a ref on obj B which has a ref on obj A). The weakref module uses
                      'proxy' references that let reference-counting do its job (I guess the
                      doc will be much more explicit than me).
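
A small sketch of the weakref idea applied to cross-references between nodes. The `Peer` class is hypothetical; it stores its neighbours in a weakref.WeakSet, so two peers pointing at each other do not keep each other alive.

```python
import weakref

class Peer:
    """Hypothetical node that knows its neighbours without owning them."""
    def __init__(self, name):
        self.name = name
        self.neighbours = weakref.WeakSet()   # weak links only

    def connect(self, other):
        self.neighbours.add(other)
        other.neighbours.add(self)

a = Peer("a")
b = Peer("b")
a.connect(b)
names_before = sorted(p.name for p in a.neighbours)

del b   # the only strong reference to b is gone, so CPython frees it at once
names_after = sorted(p.name for p in a.neighbours)
```

The trade-off is that something else (for example one central list of nodes) must hold the strong references, otherwise peers vanish as soon as nothing owns them.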

                      Another possible improvement could be to use the flyweight design
                      pattern to share memory for some attributes :

                      - a general (while somewhat Java-oriented) explanation:


                      - two Python examples (the second being based on the first)

                      When I wrote Equality for Python, my example didn't mention how the Card objects could actually be a terrific waste of memory. A commenter named versimilidude (great handle!) beat me to this post, ...
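
For readers without the linked pages, a minimal flyweight sketch in Python. This is not the code from those links; `PieceInfo` and its fields are invented for illustration. Identical immutable values are created once and shared, instead of each of the 2000 nodes holding its own copy.

```python
class PieceInfo:
    """Hypothetical immutable description of one file piece, shared by all nodes."""
    __slots__ = ("index", "size")   # no per-instance __dict__, saves memory
    _cache = {}                     # (index, size) -> the single shared instance

    def __new__(cls, index, size):
        key = (index, size)
        instance = cls._cache.get(key)
        if instance is None:
            instance = super().__new__(cls)
            instance.index = index
            instance.size = size
            cls._cache[key] = instance
        return instance

first = PieceInfo(1, 256)
second = PieceInfo(1, 256)   # same values, so the same object comes back
other = PieceInfo(2, 256)
```

Note that the cache itself holds strong references, so this only pays off when the set of distinct values is small compared with the number of users.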


                      HTH
                      --
                      bruno desthuilliers
                      python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
                      p in 'onurb@xiludom.gro'.split('@')])"

                      • diffuser78@gmail.com

                        #12
                        Re: Memory leak in Python

                        Sure, are there any available simulators? Since I am modifying some
                        stuff I thought of creating one of my own. But if you know some
                        existing simulators, those could be of great help to me.

                        Thanks

                        • diffuser78@gmail.com

                          #13
                          Re: Memory leak in Python

                          With 1024 nodes it runs fine... but takes around 4 hours to run on an AMD
                          3100.

                          • diffuser78@gmail.com

                            #14
                            Re: Memory leak in Python

                            I ran the simulation for 128 nodes and used the following

                            oo = gc.get_objects()
                            print(len(oo))

                            On every time step the number of objects is increasing. For 128 nodes
                            I had 1058177 objects.

                            I think I need to revisit the code and remove the references... but how
                            do I do that? I am still a newbie coder and any help will be greatly
                            appreciated.
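
One possible way to find which references need removing, offered as a sketch rather than a recipe: pick one object that should already be gone and ask the gc module what still refers to it. The `Message` class and the `pending` list are hypothetical.

```python
import gc

class Message:
    """Hypothetical object that should be dropped after it is processed."""
    def __init__(self, payload):
        self.payload = payload

pending = []                  # stands in for a queue that is never drained
msg = Message("piece 17")
pending.append(msg)

def list_holders(obj):
    """Return the lists that currently hold a reference to obj."""
    return [ref for ref in gc.get_referrers(obj) if isinstance(ref, list)]

held_before = any(holder is pending for holder in list_holders(msg))

pending.clear()               # the fix: drop processed messages from the queue
held_after = any(holder is pending for holder in list_holders(msg))
```

gc.get_referrers also reports dicts (including module globals and instance attributes), so in a real program the filter would need to be widened.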

                            thanks

                            • Sybren Stuvel

                              #15
                              Re: Memory leak in Python

                              diffuser78@gmail.com enlightened us with:
                              > Sure, are there any available simulators...since i am modifying some
                              > stuff i thought of creating one of my own. But if you know some
                              > existing simulators, those can be of great help to me.

                              Don't know any by name, but I'm sure you can find some on Google. Do
                              you need a discrete-event or a discrete-time simulator?

                              Sybren
                              --
                              The problem with the world is stupidity. Not saying there should be a
                              capital punishment for stupidity, but why don't we just take the
                              safety labels off of everything and let the problem solve itself?
                              Frank Zappa
