how to release memory?

  • fekioh
    New Member
    • Aug 2010
    • 11

    how to release memory?

    I am new to Python. I wrote a program that reads in large text files using xreadlines, stores some values in lists (and arrays), and does some calculations. It runs fine for individual files, but when I try to process the files in a folder one after another, I get a memory error.

    My program looks like this:
    Code:
    data = fileName.xreadlines()
    for line in data:
       tokens = line.split(';')
       list1.append(tokens[2])
       list2.append(tokens[3])
       ...
       ...
    outfile.write(results)
    When I enter fileName manually and run it for one file, it works fine, but when I do:

    for file in os.listdir(dir):
        code as above

    I get a memory error after processing the first file.

    I have tried deleting the lists manually after the calculations, either with aList = [] or del(aList). So, how can I free memory? Platform: Windows XP
    Last edited by bvdet; Aug 7 '10, 01:42 PM. Reason: Add code tags
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    xreadlines() has been deprecated since release Python 2.3. Iterate on the file object like this:
    Code:
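    import os

    folder = '.'  # assumed: the directory that holds the data files
    for filename in os.listdir(folder):
        f = open(os.path.join(folder, filename))
        for line in f:
            .................
        f.close()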
    Try the above and let us know if it works for your application.


    • fekioh
      New Member
      • Aug 2010
      • 11

      #3
      Tried it, but it still does not work. It is very annoying that it always fails after the first file. I have about 400 files, so I cannot process them one by one...

      Maybe I must do something in the for loop to free the used memory after reading the first file?


      • bvdet
        Recognized Expert Specialist
        • Oct 2006
        • 2851

        #4
        Make sure any open file objects are explicitly closed. Is it possible that there are circular references in your stored data? Python has a robust garbage collection implementation, but garbage collection is not guaranteed to happen, especially to garbage containing circular references.
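
A minimal sketch of the advice above, assuming the per-file work can be moved into a function: the `with` block closes the file even if an error occurs, and the lists are local, so they become unreachable as soon as the function returns. The names `process_file` and `process_folder` are hypothetical, and the `;` separator and column indices are taken from the first post rather than from the attached program.

```python
import os


def process_file(path):
    """Read one ';'-separated file and return a small summary.

    The lists are local, so they are released when the function returns.
    """
    col_a = []
    col_b = []
    with open(path) as f:          # closed automatically, even on error
        for line in f:             # iterates lazily, one line at a time
            tokens = line.rstrip("\n").split(";")
            if len(tokens) < 4:
                continue           # skip short or blank lines
            col_a.append(tokens[2])
            col_b.append(tokens[3])
    return len(col_a), len(col_b)


def process_folder(folder):
    """Process every regular file in `folder`, one at a time."""
    results = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            results[name] = process_file(path)
    return results
```

Returning only the small summary, rather than the lists themselves, is what keeps memory use from accumulating across files in this sketch.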


        • fekioh
          New Member
          • Aug 2010
          • 11

          #5
          Thank you, but still no luck... All files are closed, yes. And I don't think there are any cyclic references. Is there a way to check what is actually still stored in memory when the first loop iteration finishes?

          Comment

          • bvdet
            Recognized Expert Specialist
            • Oct 2006
            • 2851

            #6
            One of the benefits of Python is that you should not have to worry about memory. Check out the gc module.
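
One way the gc module can help answer the question of what is still in memory is to count live objects by type before and after processing a file. This is only a sketch: `snapshot` and `growth` are hypothetical helper names, and `gc.get_objects()` lists only objects tracked by the collector (containers such as lists, dicts, and class instances), not individual strings or numbers.

```python
import gc
from collections import Counter


def snapshot():
    """Count the objects currently tracked by the garbage collector, by type name."""
    gc.collect()  # drop unreachable cycles first so they are not counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, top=10):
    """Return the types whose live-object count increased between two snapshots."""
    diff = after - before  # Counter subtraction keeps positive counts only
    return diff.most_common(top)
```

Taking a snapshot at the start of the loop body and another at the end, then printing `growth(before, after)`, shows which kinds of objects survive from one file to the next.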


            • fekioh
              New Member
              • Aug 2010
              • 11

              #7
              This must be quite straightforward. I guess I am not seeing something.

              I had already tried gc.enable()... Now I tried
              print(gc.collect())
              print(gc.collect())

              at the end of each for loop, and it printed:
              90
              0

              but I still get a memory error on the next loop.


              • bvdet
                Recognized Expert Specialist
                • Oct 2006
                • 2851

                #8
                That's all I can think of. Unless you post your code and some sample data, I am at a loss.


                • fekioh
                  New Member
                  • Aug 2010
                  • 11

                  #9
                  ok here's the code.

                  The first part consists of functions, which you probably don't need to look at at all. The for loop in question starts at line 180. Up to line 215, I just read the first line of each file to see which columns I will need to use, and then I store the columns in lists.

                  In the final part (lines 240 to the end), I do some calculations on subsets of the columns and write the results to a file.

                  Thanks a lot!
                  Attached Files


                  • bvdet
                    Recognized Expert Specialist
                    • Oct 2006
                    • 2851

                    #10
                    I think you can simplify your code a great deal by creating one data object to hold the file data. It appears you have a simple CSV data file, and the first line contains header data. This will create one dictionary:
                    Code:
                    f = open('the_file_name')
                    keys = f.readline().strip().split(',')
                    dd = {}
                    for line in f:
                        for i, item in enumerate(line.strip().split(',')):
                            if keys[i] in parList:
                                dd.setdefault(keys[i], []).append(item)
                    f.close()
                    Each list of data items can be accessed by keyword.
                    Example: dd['CoG_X'] would return the equivalent of the list object x if I understand the code correctly.
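
The same column dictionary could also be built with the standard csv module. The sketch below is written for Python 3 and assumes a file with a header line; `read_columns` and its `wanted` parameter are hypothetical names (`wanted` plays the role of `parList` above), and the delimiter is left configurable because the first post splits on `;`.

```python
import csv


def read_columns(path, wanted, delimiter=","):
    """Return {column name: list of string values} for the wanted columns."""
    wanted = set(wanted)
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        header = next(reader)  # assumes the file has at least a header line
        # Remember only the positions of the columns we care about.
        keep = [(i, name) for i, name in enumerate(header) if name in wanted]
        columns = {name: [] for _, name in keep}
        for row in reader:
            for i, name in keep:
                if i < len(row):  # tolerate short rows
                    columns[name].append(row[i])
    return columns
```

Looking up the wanted column positions once, from the header, avoids repeating the `keys[i] in parList` test for every field of every line.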

                    When writing the results to file, I suggest you take advantage of str method join().
                    Code:
                    def write2file(alist):
                        outfile = open('autoBatch.txt', 'a')
                        outfile.write('\t'.join([str(item) for item in alist]))
                        outfile.write('\n')
                        outfile.close()  # release the file handle after each call
                    Don't use list as a variable name, because it masks the built-in function list().

                    Try inserting print statements at strategic points in your code to see exactly where the memory error occurs. Most IDEs provide debugging tools that allow you to step through code.
                    Last edited by bvdet; Aug 8 '10, 01:27 PM.
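
The suggestion to add print statements at strategic points could look roughly like the sketch below, which logs each file's name and size before processing it and records which files raise MemoryError. `run_with_report` and `worker` are hypothetical names; `worker` stands in for whatever function does the per-file processing.

```python
import os
import sys


def run_with_report(paths, worker, log=sys.stderr):
    """Call worker(path) for each path and log progress around each call.

    Returns the list of paths on which worker raised MemoryError.
    """
    failed = []
    for path in paths:
        size = os.path.getsize(path)
        log.write("start %s (%d bytes)\n" % (path, size))
        try:
            worker(path)
        except MemoryError:
            log.write("MemoryError while processing %s\n" % path)
            failed.append(path)
        else:
            log.write("done  %s\n" % path)
    return failed
```

If the log shows the error always on the second file regardless of its size, that points at state left over from the previous iteration rather than at any one large file.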


                    • fekioh
                      New Member
                      • Aug 2010
                      • 11

                      #11
                      Thanks a lot!

