Process "Killed"

  • dieter

    Process "Killed"

    Hi,

    Overview
    =======

    I'm doing some simple file manipulation work and the process gets
    "Killed" everytime I run it. No traceback, no segfault... just the
    word "Killed" in the bash shell and the process ends. The first few
    batch runs would only succeed with one or two files being processed
    (out of 60) before the process was "Killed". Now it makes no
    successful progress at all. Just a little processing then "Killed".


    Question
    =======

    Any Ideas? Is there a buffer limitation? Do you think it could be the
    filesystem?
    Any suggestions appreciated.... Thanks.


    The code I'm running:
    =====================

    from glob import glob

    def manipFiles():
        filePathList = glob('/data/ascii/*.dat')
        for filePath in filePathList:
            f = open(filePath, 'r')
            lines = f.readlines()[2:]
            f.close()
            f = open(filePath, 'w')
            f.writelines(lines)
            f.close()
            print file


    Sample lines in File:
    =====================

    # time, ap, bp, as, bs, price, vol, size, seq, isUpLast, isUpVol,
    isCancel

    1062993789 0 0 0 0 1022.75 1 1 0 1 0 0
    1073883668 1120 1119.75 28 33 0 0 0 0 0 0 0


    Other Info
    ========

    - The file sizes range from 76 KB to 146 MB
    - I'm running on a Gentoo Linux OS
    - The filesystem is partitioned and using: XFS for the data
    repository, Reiser3 for all else.
  • Matt Nordhoff

    #2
    Re: Process "Killed"

    dieter wrote:
    Hi,
    >
    Overview
    =======
    >
    I'm doing some simple file manipulation work and the process gets
    "Killed" everytime I run it. No traceback, no segfault... just the
    word "Killed" in the bash shell and the process ends. The first few
    batch runs would only succeed with one or two files being processed
    (out of 60) before the process was "Killed". Now it makes no
    successful progress at all. Just a little processing then "Killed".
    That isn't a Python thing. Run "sleep 60" in one shell, then "kill -9"
    the process in another shell, and you'll get the same message.
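
    If you want to see the same thing from Python itself, a tiny
    demonstration (just a sketch, run it from a bash prompt) is a script
    that sends SIGKILL to its own process:

    import os
    import signal

    # SIGKILL can't be caught, so the process dies on the spot and bash
    # prints the one-word "Killed" message: no traceback, no cleanup.
    os.kill(os.getpid(), signal.SIGKILL)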

    I know my shared web host has a daemon that does that to processes that
    consume too many resources.

    Wait a minute. If you ran this multiple times, won't it have removed the
    first two lines from the first files multiple times, deleting some data
    you actually care about? I hope you have backups...
    Question
    =======
    >
    Any Ideas? Is there a buffer limitation? Do you think it could be the
    filesystem?
    Any suggestions appreciated.... Thanks.
    >
    >
    The code I'm running:
    =====================
    >
    from glob import glob
    >
    def manipFiles():
        filePathList = glob('/data/ascii/*.dat')
    If that dir is very large, that could be slow. Both because glob will
    run a regexp over every filename, and because it will return a list of
    every file that matches.

    If you have Python 2.5, you could use glob.iglob() instead of
    glob.glob(), which returns an iterator instead of a list.
        for filePath in filePathList:
            f = open(filePath, 'r')
            lines = f.readlines()[2:]
    This reads the entire file into memory. Even better, I bet slicing
    copies the list object temporarily, before the first one is destroyed.
            f.close()
            f = open(filePath, 'w')
            f.writelines(lines)
            f.close()
            print file
    This is unrelated, but "print file" will just say "<type 'file'>",
    because it's the name of a built-in object, and you didn't assign to it
    (which you shouldn't anyway).


    Actually, if you *only* ran that exact code, it should exit almost
    instantly, since it does one import, defines a function, but doesn't
    actually call anything. ;-)
    Sample lines in File:
    =====================
    >
    # time, ap, bp, as, bs, price, vol, size, seq, isUpLast, isUpVol,
    isCancel
    >
    1062993789 0 0 0 0 1022.75 1 1 0 1 0 0
    1073883668 1120 1119.75 28 33 0 0 0 0 0 0 0
    >
    >
    Other Info
    ========
    >
    - The file sizes range from 76 KB to 146 MB
    - I'm running on a Gentoo Linux OS
    - The filesystem is partitioned and using: XFS for the data
    repository, Reiser3 for all else.
    How about this version? (note: untested)

    import glob
    import os

    def manipFiles():
        # If you don't have Python 2.5, use "glob.glob" instead.
        filePaths = glob.iglob('/data/ascii/*.dat')
        for filePath in filePaths:
            print filePath
            fin = open(filePath, 'rb')
            fout = open(filePath + '.out', 'wb')
            # Discard two lines
            fin.next(); fin.next()
            fout.writelines(fin)
            fin.close()
            fout.close()
            os.rename(filePath + '.out', filePath)

    I don't know how light it will be on CPU, but it should use very little
    memory (unless you have some extremely long lines, I guess). You could
    also write a version that just uses .read() and .write() in chunks; a
    rough sketch of that idea is below.
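
    A rough, untested sketch of the chunked idea, using shutil.copyfileobj
    (which copies in fixed-size blocks, so memory use stays small even if
    the lines are huge; filePath here is the loop variable from above):

    import shutil

    fin = open(filePath, 'rb')
    fout = open(filePath + '.out', 'wb')
    fin.readline(); fin.readline()    # throw away the first two lines
    shutil.copyfileobj(fin, fout)     # copy the rest in 16 KB chunks
    fin.close()
    fout.close()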

    Also, it temporarily duplicates "whatever.dat" to "whatever.dat.out",
    and if "whatever.dat.out" already exists, it will blindly overwrite it.

    Also, if this is anything but a one-shot script, you should use
    "try...finally" statements to make sure the file objects get closed (or,
    in Python 2.5, the "with" statement).
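
    For what it's worth, here is roughly what that would look like with the
    "with" statement (an untested sketch; Python 2.5 needs the __future__
    import at the top of the module):

    from __future__ import with_statement
    import glob
    import os

    def manipFiles():
        for filePath in glob.iglob('/data/ascii/*.dat'):
            print filePath
            # The files are closed automatically, even if something fails
            # halfway through.
            with open(filePath, 'rb') as fin:
                with open(filePath + '.out', 'wb') as fout:
                    fin.next(); fin.next()   # discard the first two lines
                    fout.writelines(fin)
            os.rename(filePath + '.out', filePath)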

    • Glenn Hutchings

      #3
      Re: Process "Killed"

      dieter <vel.accel@gmail.com> writes:
      I'm doing some simple file manipulation work and the process gets
      "Killed" everytime I run it. No traceback, no segfault... just the
      word "Killed" in the bash shell and the process ends. The first few
      batch runs would only succeed with one or two files being processed
      (out of 60) before the process was "Killed". Now it makes no
      successful progress at all. Just a little processing then "Killed".
      >
      Any Ideas? Is there a buffer limitation? Do you think it could be the
      filesystem?
      Any suggestions appreciated.... Thanks.
      >
      The code I'm running:
      =====================
      >
      from glob import glob
      >
      def manipFiles():
          filePathList = glob('/data/ascii/*.dat')
          for filePath in filePathList:
              f = open(filePath, 'r')
              lines = f.readlines()[2:]
              f.close()
              f = open(filePath, 'w')
              f.writelines(lines)
              f.close()
              print file
      Have you checked memory usage while your program is running? Your

      lines = f.readlines()[2:]

      statement will need almost twice the memory of your largest file. This
      might be a problem, depending on your RAM and what else is running at the
      same time.

      If you want to reduce memory usage to almost zero, try reading lines from
      the file and writing all but the first two to a temporary file, then
      renaming the temp file to the original:

      import os

      infile = open(filePath, 'r')
      outfile = open(filePath + '.bak', 'w')

      for num, line in enumerate(infile):
          if num >= 2:
              outfile.write(line)

      infile.close()
      outfile.close()
      os.rename(filePath + '.bak', filePath)

      Glenn

      • Paul Boddie

        #4
        Re: Process "Killed"

        On 28 Aug, 07:30, dieter <vel.ac...@gmail.com> wrote:
        >
        I'm doing some simple file manipulation work and the process gets
        "Killed" everytime I run it. No traceback, no segfault... just the
        word "Killed" in the bash shell and the process ends. The first few
        batch runs would only succeed with one or two files being processed
        (out of 60) before the process was "Killed". Now it makes no
        successful progress at all. Just a little processing then "Killed".
        It might be interesting to check the various limits in your shell. Try
        this command:

        ulimit -a

        Documentation can be found in the bash manual page. The limits include
        memory size, CPU time, open file descriptors, and a few other things.
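
        From inside Python, the same limits can be read with the standard
        "resource" module (Unix only); a small sketch, just for poking at
        the values:

        import resource

        for name in ('RLIMIT_AS', 'RLIMIT_DATA', 'RLIMIT_CPU', 'RLIMIT_NOFILE'):
            soft, hard = resource.getrlimit(getattr(resource, name))
            # -1 here is resource.RLIM_INFINITY, i.e. "unlimited"
            print name, soft, hard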

        Paul

        • Fredrik Lundh

          #5
          Re: Process "Killed"

          dieter wrote:
          Any Ideas? Is there a buffer limitation? Do you think it could be the
          filesystem?
          what does "ulimit -a" say?

          </F>

          • Fredrik Lundh

            #6
            Re: Process "Killed"

            Glenn Hutchings wrote:
            Have you checked memory usage while your program is running? Your
            >
            lines = f.readlines()[2:]
            >
            statement will need almost twice the memory of your largest file.
            footnote: list objects contain references to string objects, not the
            strings themselves. the above temporarily creates two list objects, but
            the actual file content is only stored once.
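
            a quick way to see that (purely illustrative; f is an open file):

            lines = f.readlines()       # one list of string objects
            tail = lines[2:]            # a second, shorter list
            print tail[0] is lines[2]   # True: both refer to the same string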

            </F>

            • Eric Wertman

              #7
              Re: Process "Killed"

              I'm doing some simple file manipulation work and the process gets
              "Killed" everytime I run it. No traceback, no segfault... just the
              word "Killed" in the bash shell and the process ends. The first few
              batch runs would only succeed with one or two files being processed
              (out of 60) before the process was "Killed". Now it makes no
              successful progress at all. Just a little processing then "Killed".
              This is the behavior you'll see when your OS has run out of some
              memory resource. The kernel sends a signal 9 (SIGKILL). I'm pretty
              sure that if you exceed a soft limit, your program will abort with
              an out-of-memory error.

              Eric

              • dieter h

                #8
                Re: Process "Killed"

                On Sat, Aug 30, 2008 at 11:07 AM, Eric Wertman <ewertman@gmail.com> wrote:
                >I'm doing some simple file manipulation work and the process gets
                >"Killed" everytime I run it. No traceback, no segfault... just the
                >word "Killed" in the bash shell and the process ends. The first few
                >batch runs would only succeed with one or two files being processed
                >(out of 60) before the process was "Killed". Now it makes no
                >successful progress at all. Just a little processing then "Killed".
                >
                This is the behavior you'll see when your OS has run out of some
                memory resource. The kernel sends a signal 9 (SIGKILL). I'm pretty
                sure that if you exceed a soft limit, your program will abort with
                an out-of-memory error.
                >
                Eric
                >
                Eric, thank you very much for your response.
