Filtering through an external process

  • Paul Rubin

    Filtering through an external process

    Anyone know if there's code around to filter text through an external
    process? Sort of like the Emacs "filter-region" command. For
    example, say I have a program that reads input in English and outputs
    it in Pig Latin. I want my Python script to call the program, pipe
    some input into it and read the output:

    english = "hello world"
    pig_latin = ext_filter("pig_latin", english)

    should set pig_latin to "ellohay orldway".

    Note that you can't just call popen2, jam the english through it and
    then read the pig latin, because the subprocess can block if you give
    it too much input before reading the output, and in general there's no
    way to know how much buffering the subprocess is willing to do. So a
    proper solution has to use asynchronous i/o and keep polling the
    output side, or else separate threads for reading and writing.

    This is something that really belongs in the standard library. I've
    needed it several times and rather than going to the trouble of coding
    and debugging it, I've always ended up using a temp file instead,
    which is a kludge.
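A minimal sketch of the ext_filter described above, assuming a modern Python 3 (3.7 or later) where the subprocess module is available; that module is not mentioned in this thread. subprocess.run() with input= goes through Popen.communicate(), which writes the input and reads the output concurrently, so a large input should not hang the caller. Taking the command as an argument list instead of a string is a choice made for this sketch, and the upper-casing child process is only a stand-in for the hypothetical pig_latin program.

```python
import subprocess
import sys


def ext_filter(command, text):
    """Run command (an argv list), feed it text on stdin, return its stdout."""
    completed = subprocess.run(
        command,
        input=text,           # written to the child's stdin
        capture_output=True,  # collect stdout and stderr
        text=True,            # str in, str out
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in filter: a child Python process that upper-cases its input.
    upper = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ]
    print(ext_filter(upper, "hello world"))
```

The whole output is held in memory, which is fine for filtering a string but not for unbounded streams.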
  • Raymond Hettinger

    #2
    Re: Filtering through an external process


    "Paul Rubin" <http://phr.cx@NOSPAM.invalid> wrote in message
    news:7xllpyyl1s.fsf_-_@ruckus.brouhaha.com...
    > Anyone know if there's code around to filter text through an external
    > process? Sort of like the Emacs "filter-region" command. For
    > example, say I have a program that reads input in English and outputs
    > it in Pig Latin. I want my Python script to call the program, pipe
    > some input into it and read the output:
    >
    > english = "hello world"
    > pig_latin = ext_filter("pig_latin", english)
    >
    > should set pig_latin to "ellohay orldway".
    >
    > Note that you can't just call popen2, jam the english through it and
    > then read the pig latin, because the subprocess can block if you give
    > it too much input before reading the output, and in general there's no
    > way to know how much buffering the subprocess is willing to do. So a
    > proper solution has to use asynchronous i/o and keep polling the
    > output side, or else separate threads for reading and writing.
    >
    > This is something that really belongs in the standard library. I've
    > needed it several times and rather than going to the trouble of coding
    > and debugging it, I've always ended up using a temp file instead,
    > which is a kludge.

    The time machine lives!

    =========================
    Add this file: Lib/encodings/pig.py
    ----------------------------------------
    "Pig Latin Codec -- Lib/encodings/pig.py"

    import codecs, re

    def encode(input, errors='strict'):
        output = re.sub(r'\b(th|ch|st|\w)(\w+)\b', r'\2\1ay', input)
        return (output, len(input))

    def decode(input, errors='strict'):
        output = re.sub(r'(\b\w+?)(th|ch|st|\w)ay\b', r'\2\1', input)
        return (output, len(input))

    def getregentry():
        return (encode, decode, codecs.StreamReader, codecs.StreamWriter)
    -------------------------------------------


    Now, fire-up Python:
    >>> 'hello world'.encode('pig')
    'ellohay orldway'
    >>> 'ellohay orldway'.decode('pig')
    'hello world'



    Raymond Hettinger
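The codec above relies on the Python 2 encodings machinery (getregentry, str.decode). As a rough sketch only, the same two regular expressions can be wrapped in plain functions that run on Python 3; the names to_pig and from_pig are made up for this example and are not part of any library.

```python
import re


def to_pig(text):
    """Move the leading th/ch/st (or single letter) of each word to the end, add 'ay'."""
    return re.sub(r'\b(th|ch|st|\w)(\w+)\b', r'\2\1ay', text)


def from_pig(text):
    """Reverse to_pig for words produced by it."""
    return re.sub(r'(\b\w+?)(th|ch|st|\w)ay\b', r'\2\1', text)


if __name__ == "__main__":
    print(to_pig("hello world"))        # ellohay orldway
    print(from_pig("ellohay orldway"))  # hello world
```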






    • Paul Rubin

      #3
      Re: Filtering through an external process

      "Raymond Hettinger" <vze4rx4y@verizon.net> writes:
      > The time machine lives!
      >
      > =========================
      > Add this file: Lib/encodings/pig.py
      > ----------------------------------------
      > "Pig Latin Codec -- Lib/encodings/pig.py"

      Chuckle :). But I had in mind a more general-purpose means of running
      external processes.


      • Scott David Daniels

        #4
        Re: Filtering through an external process

        Paul Rubin wrote:
        > Anyone know if there's code around to filter text through an external
        > process? Sort of like the Emacs "filter-region" command. For
        Check out popen2 -- it's the piece you need.

        dest, result = os.popen2('cmd')
        dest.write('echo Hello world\n')
        dest.write('exit\n')
        dest.close()
        result.read()


        So, perhaps you mean:
        import os

        def filtered(command, source):
            dest, result = os.popen2(command)
            dest.write(source)
            dest.close()
            try:
                return result.read()
            finally:
                result.close()


        -Scott David Daniels
        Scott.Daniels@Acm.Org


        • Paul Rubin

          #5
          Re: Filtering through an external process

          Scott David Daniels <Scott.Daniels@Acm.Org> writes:
          > > Anyone know if there's code around to filter text through an external
          > > process? Sort of like the Emacs "filter-region" command. For
          > Check out popen2 -- it's the piece you need.

          No, that doesn't do the job. If you popen2 a process and send too
          much input without reading the output, the subprocess will block and
          your application will hang. That is explained in the docs. Doing it
          right is a little bit complicated. You need threads or asynchronous
          i/o. That's the functionality that's missing.
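The thread-based approach mentioned above might look roughly like the sketch below. It assumes Python 3 and the subprocess and threading modules, neither of which appears in this thread, and filter_with_thread is a name invented for the example. A helper thread writes the input while the main thread drains the output, so neither side waits on a full pipe. The child used here echoes each line upper-cased as soon as it reads it, which is the kind of filter that can hang a write-everything-then-read caller once the input is large enough.

```python
import subprocess
import sys
import threading


def filter_with_thread(command, data):
    """Pipe bytes through command; one thread writes while the caller reads."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        try:
            proc.stdin.write(data)
            proc.stdin.close()  # signals end of input to the child
        except OSError:
            pass  # the child exited before consuming all of its input

    writer = threading.Thread(target=feed)
    writer.start()
    output = proc.stdout.read()  # drain output while the writer is still feeding
    writer.join()
    proc.stdout.close()
    proc.wait()
    return output


if __name__ == "__main__":
    # Stand-in streaming filter: upper-cases stdin line by line.
    child = (
        "import sys\n"
        "for line in sys.stdin.buffer:\n"
        "    sys.stdout.buffer.write(line.upper())\n"
    )
    data = b"hello world\n" * 100000
    result = filter_with_thread([sys.executable, "-c", child], data)
    print(len(result), result[:12])
```

In practice Popen.communicate() already does this bookkeeping, so a hand-rolled thread is mostly useful when the output has to be processed incrementally.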
