Python 2.2.1 and select()

  • Derek Martin

    Python 2.2.1 and select()

    Hi kids!

    I've got some code that uses select.select() to capture all the output
    of a subprocess (both stdout and stderr, see below). This code works
    as expected on a variety of Fedora systems running Python 2.4.0, but
    on a Debian Sarge system running Python 2.2.1 it's a no-go. I'm
    thinking this is a bug in that particular version of Python, but I'd
    like to have confirmation if anyone can provide it.

    The behavior I see is this: the call to select() returns:
    [<file corresponding to sub-proc's STDOUT>] [] []

    If and only if the total amount of output is greater than the
    specified buffer size, then reading on this file hangs indefinitely.
    For what it's worth, the program whose output I need to capture with
    this generates about 17k of output to STDERR, and about 1k of output
    to STDOUT, at essentially random intervals. But I also ran it with a
    test shell script that generates roughly the same amount of output to
    each file object, alternating between STDOUT and STDERR, with the same
    results.

    Yes, I'm aware that this version of Python is quite old, but I don't
    have a great deal of control over that (though if this is indeed a
    python bug, as opposed to a problem with my implementation, it might
    provide some leverage to get it upgraded)... Thanks in advance for
    any help you can provide. The code in question (quite short) follows:

    def capture(cmd):
        buffsize = 8192
        inlist = []
        inbuf = ""
        errbuf = ""

        io = popen2.Popen3(cmd, True, buffsize)
        inlist.append(io.fromchild)
        inlist.append(io.childerr)
        while True:
            ins, outs, excepts = select.select(inlist, [], [])
            for i in ins:
                x = i.read()
                if not x:
                    inlist.remove(i)
                else:
                    if i == io.fromchild:
                        inbuf += x
                    if i == io.childerr:
                        errbuf += x
            if not inlist:
                break
        if io.wait():
            raise FailedExitStatus, errbuf
        return (inbuf, errbuf)

    If anyone would like, I could also provide a shell script and a main
    program one could use to test this function...

    --
    Derek D. Martin

    GPG Key ID: 0x81CFE75D



  • Noah

    #2
    Re: Python 2.2.1 and select()

    On Mar 24, 2:58 pm, Derek Martin <c...@pizzashack.org> wrote:
    > If and only if the total amount of output is greater than the
    > specified buffer size, then reading on this file hangs indefinitely.
    > For what it's worth, the program whose output I need to capture with
    > this generates about 17k of output to STDERR, and about 1k of output
    > to STDOUT, at essentially random intervals. But I also ran it with a
    > test shell script that generates roughly the same amount of output to
    > each file object, alternating between STDOUT and STDERR, with the same
    > results.

    I think this is more of a limitation with the underlying clib.
    Subprocess buffering defaults to block buffering instead of
    line buffering. You can't change this unless you can recompile
    the application you are trying to run in a subprocess or
    unless you run your subprocess in a pseudotty (pty).

    Pexpect takes care of this problem. See http://www.noah.org/wiki/Pexpect
    for more info.
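
For readers without Pexpect, the pty idea can be sketched with the standard library alone. This is only a minimal sketch, assuming a Unix system and modern Python 3 with the subprocess module (not the popen2 module used earlier in the thread). The names run_in_pty and CHILD are made up for illustration. The child here just reports whether its stdout looks like a terminal, which is the condition C stdio uses to choose line buffering.

```python
import os
import pty
import subprocess
import sys


def run_in_pty(argv):
    """Run argv with stdout/stderr attached to a pty; return combined output as bytes."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(argv, stdout=slave_fd, stderr=slave_fd)
    finally:
        # The parent must close its copy of the slave end, otherwise reads
        # on the master side never see the end of the child's output.
        os.close(slave_fd)

    chunks = []
    while True:
        try:
            data = os.read(master_fd, 4096)
        except OSError:
            # Linux reports an I/O error on the master once the child side is closed.
            break
        if not data:
            # Other systems signal the same condition with an empty read.
            break
        chunks.append(data)

    os.close(master_fd)
    proc.wait()
    return b"".join(chunks)


# Hypothetical child, used only for illustration: it prints whether its
# stdout is a terminal.
CHILD = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
```

Note that output read from a pty normally arrives with "\r\n" line endings, and stdout and stderr are merged here, so this is not a drop-in replacement for the capture() function above.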

    --
    Noah


    • Gabriel Genellina

      #3
      Re: Python 2.2.1 and select()

      On Mon, 24 Mar 2008 23:03:56 -0300, Derek Martin <code@pizzashack.org>
      wrote:
      > On Mon, Mar 24, 2008 at 05:52:54PM -0700, Noah wrote:
      > > On Mar 24, 2:58 pm, Derek Martin <c...@pizzashack.org> wrote:
      > > > If and only if the total amount of output is greater than the
      > > > specified buffer size, then reading on this file hangs indefinitely.
      You may try using two worker threads to read both streams; this way you
      don't care about the blocking issues.
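
A minimal sketch of the two-thread approach, assuming modern Python 3 and the subprocess module in place of popen2. The names capture_with_threads and CHILD are hypothetical. Each thread does a plain blocking read() on one stream, so neither stream can stall the other.

```python
import subprocess
import sys
import threading


def capture_with_threads(argv):
    """Run argv and return (exit_status, stdout_bytes, stderr_bytes)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    results = {}

    def drain(name, stream):
        # Blocking read until EOF; safe because each stream has its own thread.
        results[name] = stream.read()
        stream.close()

    threads = [
        threading.Thread(target=drain, args=("out", proc.stdout)),
        threading.Thread(target=drain, args=("err", proc.stderr)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    status = proc.wait()
    return status, results["out"], results["err"]


# Hypothetical child for illustration: roughly the output sizes mentioned
# in the original post (lots on stderr, a little on stdout).
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('e' * 20000); sys.stdout.write('o' * 1000)",
]
```

In current Python, subprocess.Popen.communicate() covers the same need without hand-written threads; the sketch only spells out the idea.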

      --
      Gabriel Genellina


      • Francesco Bochicchio

        #4
        Re: Python 2.2.1 and select()

        On Mon, 24 Mar 2008 17:58:42 -0400, Derek Martin wrote:
        > I've got some code that uses select.select() to capture all the output
        > of a subprocess (both stdout and stderr, see below).
        > [...]
        > The behavior I see is this: the call to select() returns: [<file
        > corresponding to sub-proc's STDOUT>] [] []
        >
        > If and only if the total amount of output is greater than the specified
        > buffer size, then reading on this file hangs indefinitely.
        > [...]

        From your description, it seems that two things happen:

        - there is actual data to read, but less than bufsize.
        - the subsequent read waits (for whatever reason) until a full buffer can
          be read, and therefore blocks your program.

        Try specifying bufsize=1 or doing read(1). If my guess is correct, you
        should not see the problem. I'm not sure that either is a good solution
        for you, since both have performance issues.
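
A related option, sketched here rather than taken from the thread, is to bypass the buffered file object and call os.read() on the descriptor with a chunk size. os.read() returns whatever is currently available (up to the size given) instead of waiting for a full buffer, so it avoids the byte-at-a-time cost of read(1). This assumes a Unix system and modern Python 3 with subprocess; capture_select and CHILD are hypothetical names.

```python
import os
import select
import subprocess
import sys


def capture_select(argv, chunk=8192):
    """Run argv and return (exit_status, stdout_bytes, stderr_bytes)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]

    while open_fds:
        ready, _, _ = select.select(open_fds, [], [])
        for fd in ready:
            # Returns as soon as some data is available; b"" means EOF.
            data = os.read(fd, chunk)
            if data:
                chunks[fd].append(data)
            else:
                open_fds.remove(fd)

    out = b"".join(chunks[out_fd])
    err = b"".join(chunks[err_fd])
    proc.stdout.close()
    proc.stderr.close()
    return proc.wait(), out, err


# Hypothetical child for illustration only.
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('e' * 20000); sys.stdout.write('o' * 1000)",
]
```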

        Anyway, I doubt that the Python library does more than wrap the
        system call, so if there is a bug it is probably in the software layers
        under Python.

        Ciao
        ----
        FB


        • Derek Martin

          #5
          Re: Python 2.2.1 and select()

          On Wed, Mar 26, 2008 at 09:49:51AM -0700, Noah Spurrier wrote:
          > On 2008-03-24 22:03-0400, Derek Martin wrote:
          > > That's an interesting thought, but I guess I'd need you to elaborate
          > > on how the buffering mode would affect the operation of select(). I
          > > really don't see how your explanation can cover this, given the
          > > following:
          >
          > I might be completely off the mark here. I have not tested your code or even
          > closely examined it. I don't mean to waste your time. I'm only giving a
          > reflex response because your problem seems to exactly match a very common
          > situation where someone tries to use select with a pipe to a subprocess
          > created with popen and that subprocess uses C stdio.

          Yeah, you're right, more or less. I talked to someone much smarter
          than I here in the office, who pointed out that the behavior of
          Python's read() without a specified size is to attempt to read until
          EOF. This will definitely cause the read to block (if there's I/O
          waiting from STDERR), if you're allowing I/O to block... :(
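
The difference between a sized, descriptor-level read and a read-until-EOF can be seen with a plain pipe. This is a small sketch in modern Python 3, not the code from the thread: os.read() hands back the bytes that are already there, while read() with no size on a file object can only return once every writer has closed its end.

```python
import os

read_fd, write_fd = os.pipe()
os.write(write_fd, b"hello")

# Returns immediately with the data that is available, up to 8192 bytes.
partial = os.read(read_fd, 8192)

# read() with no size keeps going until EOF. It returns here only because
# the write end is closed first; with the writer still open it would block.
os.close(write_fd)
with os.fdopen(read_fd, "rb") as reader:
    rest = reader.read()
```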

          The solution is easy though...

          def set_nonblock(fd):
              flags = fcntl.fcntl(fd, fcntl.F_GETFL)
              fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

          Then in the function, after calling popen:
              set_nonblock(io.fromchild.fileno())
              set_nonblock(io.childerr.fileno())
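
What O_NONBLOCK changes can be shown on a bare pipe. This is a minimal sketch assuming a Unix system and modern Python 3, where a read on an empty non-blocking descriptor raises BlockingIOError instead of hanging. The helper is the same as above; the pipe and variable names are only for illustration.

```python
import fcntl
import os


def set_nonblock(fd):
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


read_fd, write_fd = os.pipe()
set_nonblock(read_fd)

# Nothing has been written yet: a blocking descriptor would hang here,
# a non-blocking one fails straight away.
try:
    os.read(read_fd, 8192)
    would_block = False
except BlockingIOError:
    would_block = True

# Once data is present, the same call returns it.
os.write(write_fd, b"data")
available = os.read(read_fd, 8192)

os.close(write_fd)
os.close(read_fd)
```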

          Yay for smart people.

          --
          Derek D. Martin

          GPG Key ID: 0x81CFE75D


