perl to python

This topic is closed.
  • Oliver Fromme

    #16
    Re: perl to python

    Daniel 'Dang' Griffith <noemail@noemail4u.com> wrote:
    > [on sed] One reason
    > to install it is that it's smaller than perl or python; another is
    > that it probably performs the task faster, since it isn't a general
    > purpose state machine;

    FWIW, sed _is_ a state machine, although not really "general
    purpose". It is a programming language with variables, loops
    and conditionals, and I believe it is Turing-complete. Most
    of the time it is abused to perform simple search-and-replace
    tasks, though. ;-)

    But seriously ... I agree that the OP should really use sed
    instead of Python in this particular case, for the reasons
    that you've outlined.

    Best regards
    Oliver

    --
    Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 Munich
    Any opinions expressed in this message may be personal to the author
    and may not necessarily reflect the opinions of secnetix in any way.

    "Python is an experiment in how much freedom programmers need.
    Too much freedom and nobody can read another's code; too little
    and expressiveness is endangered." -- Guido van Rossum


    • Josef Meile

      #17
      Re: perl to python

      > There's definitely a sed available, possibly even in MingW (I have it
      > on my system, but am not sure if it arrived with MingW or something
      > else I installed). It's definitely available with cygwin. One reason
      > to install it is that it's smaller than perl or python; another is
      > that it probably performs the task faster, since it isn't a general
      > purpose state machine;
      OK, if those two are true, then using it should be
      considered for big files.

      > another is that it's 25% shorter to type than
      > perl and 50% shorter to type than python.
      I don't think that shorter code is always the
      most efficient. It is nicer, but you can't
      be sure that it is faster. Take, for example, a simple
      sort algorithm implemented with two nested loops:
      it is well known that you can use trees or other
      strategies to achieve better results; however,
      some of them are longer than the loop implementation.


      • Roy Smith

        #18
        Re: perl to python

        Oliver Fromme <olli@haluter.fromme.com> wrote:
        > FWIW, sed _is_ a state machine, although not really "general
        > purpose". It is a programming language with variables, loops
        > and conditionals, and I believe it is Turing-complete. Most
        > of the time it is abused to perform simple search-and-replace
        > tasks, though. ;-)

        I would disagree that the "simple search-and-replace" usage is abuse.
        It's just using the tool to do what it's best at. Sure, there are some
        more complex things you can do in sed, but the syntax is so baroque it
        quickly becomes trying to bash a screw with a hammer.

        In the old days, when the task became too complicated for sed, you
        switched to awk. When things got even more complex, you pasted sed,
        grep, awk, and shell together in various ways, and perl was invented to
        cover all those functionalities in a single language.

        In a sense, perl suffers from the same disease that C++ does; a desire
        to maintain backwards compatibility with its parents (thus the absurdly
        eclectic syntax) while at the same time adding every new feature you
        could imagine (and some that you can't).

        Anyway, I think there's a lot of value in learning tools like grep and
        sed, and using them when appropriate. The example that started this
        thread is the canonical example of what sed does best. Sure, you can
        make a general-purpose tool like Python do that job, but other than
        proving that you can do it, I don't see any reason to bother.


        • Kirk Job-Sluder

          #19
          Re: perl to python

          On 2004-05-11, Daniel 'Dang' Griffith <noemail@noemail4u.com> wrote:
          > There's definitely a sed available, possibly even in MingW (I have it
          > on my system, but am not sure if it arrived with MingW or something
          > else I installed). It's definitely available with cygwin. One reason
          > to install it is that it's smaller than perl or python; another is
          > that it probably performs the task faster, since it isn't a general
          > purpose state machine; another is that it's 25% shorter to type than
          > perl and 50% shorter to type than python.


          There is also a windows-native ssed (super sed).

          > --dang


          • Kirk Job-Sluder

            #20
            Re: perl to python

            On 2004-05-11, Duncan Booth <me@privacy.net> wrote:
            > Kirk Job-Sluder <kirk@eyegor.jobsluder.net> wrote in
            > Your code might have been a bit shorter if you had used the existing
            > facility in Python for editing files in place. The code below is completely
            > untested, so I can all but guarantee it doesn't work, but you get the idea:
            >
            > #!/usr/local/bin/python
            > import getopt,sys,os,re
            > import fileinput

            Thanks! Learn something new every day. I would argue that length of
            code is less an issue than the nasty exec statement.
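A minimal sketch of the in-place editing facility mentioned above, written for current Python 3 rather than the interpreter of the time. The function name `replace_in_place` and the literal (non-regex) matching are assumptions made for illustration; the original `perl -pi -e 's/string1/string2/' file` performs a regex substitution, which `re.sub` could stand in for.

```python
import fileinput


def replace_in_place(path, old, new):
    """Rewrite the file at `path`, replacing each occurrence of `old` with `new`."""
    # With inplace=True, fileinput redirects sys.stdout into the file being
    # read, so whatever is printed inside the loop becomes the new contents.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # end="" because each line already carries its own newline.
            print(line.replace(old, new), end="")
```

Calling `replace_in_place("file", "string1", "string2")` would be the rough counterpart of the perl one-liner. No `exec` is needed.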


            • Ville Vainio

              #21
              Re: perl to python

              >>>>> "Roy" == Roy Smith <roy@panix.com> writes:

              Roy> Anyway, I think there's a lot of value in learning tools like
              Roy> grep and sed, and using them when appropriate. The example

              I tend to think pretty much the opposite. Most of the time you can do
              things as easily with Python, with the added advantage of robust
              exception handling (errors not passing silently) and not having to
              learn the other things. You only need to know one regexp
              syntax. Windows can also be quite unpredictable w/ customary Unix
              tools. Cygwin has burned me a few times too many.

              The things you usually do with the non-python tools are trivial, and
              trivial things have the habit of being, well, trivial in Python too.

              Roy> does best. Sure, you can make a general-purpose tool like
              Roy> Python do that job, but other than proving that you can do
              Roy> it, I don't see any reason to bother.

              You can always implement modules to do the tasks you normally use sed
              or awk for. I never saw much virtue in using the most specialized (or
              crippled, if you wish) tool possible. Not even if it's "optimized" for
              the thing. Actually, I tend to think that Python has to some extent
              deprecated that part of the Unix tradition.

              It's funny, but somehow I can't really think of cases that a
              specialized language would do better (ignoring the performance, which
              is rarely a concern in sysadmin tasks) than Python with some
              modules. Specialized languages were great at a time when the general
              purpose languages sucked, but that's not the case anymore.

              And yes, I'm aware that I'm exposing myself to some serious flammage
              from "if it was good enough for my grandad, it's good enough for me"
              *nix crowd. Emotional attachment to various cute little tools is
              understandable, but sometimes it's good to take a fresh perspective
              and just let go.

              --
              Ville Vainio http://tinyurl.com/2prnb


              • Peter Hickman

                #22
                Re: perl to python

                Ville Vainio wrote:
                > It's funny, but somehow I can't really think of cases that a
                > specialized language would do better (ignoring the performance, which
                > is rarely a concern in sysadmin tasks) than Python with some
                > modules.

                There is more to computer usage than sysadmin tasks. sed is an ideal
                tool for processing large sets of large files (I have to handle small
                files that are only 130 MB in size, and I have around 140,000 of them).

                Performance is not an issue you can ignore when you are handling large
                amounts of data. Long may sed and awk live; you just have to make sure that
                the O'Reilly books are to hand, because the syntax is a bugger.


                • Pete Forman

                  #23
                  Re: perl to python

                  Jason Mobarak <jmob@spam__unm.edu> writes:
                  > John Roth wrote:
                  > > "Olivier Scalbert" <olivier.scalbert@algosyn.com> wrote in message
                  > > news:409e86e9$0$22811$a0ced6e1@news.skynet.be...
                  > > > What is the python way of doing this :
                  > > > perl -pi -e 's/string1/string2/' file
                  > >
                  > > I'm not sure what the -pi and -e switches do, but the rest is
                  > > fairly simple, although not as simple as the perl one-liner.
                  > > Just load the file into a string variable, and either use the
                  > > string .replace() method, or use a regex, depending on which is
                  > > appropriate. Then write it back out.
                  > > [...]
                  >
                  > More obfuscated:
                  >
                  > python -c '(lambda fp: fp.write(fp.seek(0) or
                  > "".join([L.replace("th", "ht") for L in fp])))(file("foo", "rw+"))'

                  For a less obfuscated approach, look at PyOne to run short python
                  scripts from a one-line command.

                  http://www.unixuser.org/~euske/pyone/

                  --
                  Pete Forman -./\.- Disclaimer: This post is originated
                  WesternGeco -./\.- by myself and does not represent
                  pete.forman@westerngeco.com -./\.- opinion of Schlumberger, Baker
                  http://petef.port5.com -./\.- Hughes or their divisions.


                  • Kirk Job-Sluder

                    #24
                    Re: perl to python

                    On 2004-05-11, Ville Vainio <ville@spammers.com> wrote:
                    > The things you usually do with the non-python tools are trivial, and
                    > trivial things have the habit of being, well, trivial in Python too.

                    I've not found this to be the case due to Python's emphasis on being
                    explicit rather than implicit. My emulation of
                    "perl -pi -e" was about 24 lines in length. Even with the improvement
                    there are still 10 times as many statements where things can go wrong.

                    It is really hard to be more trivial than a complete program in one
                    command line.

                    > You can always implement modules to do the tasks you normally use sed
                    > or awk for. I never saw much virtue in using the most specialized (or
                    > crippled, if you wish) tool possible. Not even if it's "optimized" for
                    > the thing. Actually, I tend to think that Python has to some extent
                    > deprecated that part of the Unix tradition.

                    However, that raises its own host of problems such as how do you import
                    the needed modules on the command line? What do you do when that module is not
                    available? What do you do when you need additional functionality that
                    takes one line in awk but a major rewrite in python?

                    It's a matter of task efficiency. Why should I spend a half hour doing
                    in python something that takes 1 minute if you know the right sed, awk
                    or perl one-liner? There is a level of complexity where you are better
                    off using python. But why not use a one-liner when it is available?

                    > And yes, I'm aware that I'm exposing myself to some serious flammage
                    > from "if it was good enough for my grandad, it's good enough for me"
                    > *nix crowd. Emotional attachment to various cute little tools is
                    > understandable, but sometimes it's good to take a fresh perspective
                    > and just let go.

                    Write me a two-line script in python that reads a character delimited
                    file, and printf pretty-prints all of the records in a different order.

                    Sometimes, a utility that uses an implicit loop over every line of a
                    file is useful. That's not emotional attachment, it's plain common
                    sense.


                    • Carl Banks

                      #25
                      Re: perl to python

                      Kirk Job-Sluder wrote:
                      > Write me a two-line script in python that reads a character delimited
                      > file, and printf pretty-prints all of the records in a different order.

                      How about one line (broken into three for clarity):

                      for line in __import__('sys').stdin:
                          print ''.join([ x.rjust(10) for x in map(
                              line.strip().split(',').__getitem__,[4,3,2,1,0]) ])

                      Believe it or not, I actually do stuff like this on the command line
                      once in a while; to me, it's less effort to type this in than to
                      remember (read: look up) the details of awk syntax. I don't think I'm
                      typical in this regard, though.
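The one-liner above uses the Python 2 `print` statement. A more readable sketch of the same idea for Python 3 follows; the function name `reorder_line` and its keyword parameters are assumptions for illustration, not part of the original post.

```python
def reorder_line(line, order=(4, 3, 2, 1, 0), width=10, sep=","):
    """Return the fields of `line` in `order`, each right-justified to `width`."""
    # Split one delimited record into its fields, dropping the newline.
    fields = line.strip().split(sep)
    # Pick the fields in the requested order and pad each to a fixed width.
    return "".join(fields[i].rjust(width) for i in order)
```

Looping over `sys.stdin` and printing `reorder_line(line)` for each line would reproduce the behaviour of the one-liner.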


                      --
                      CARL BANKS http://www.aerojockey.com/software
                      "If you believe in yourself, drink your school, stay on drugs, and
                      don't do milk, you can get work."
                      -- Parody of Mr. T from a Robert Smigel Cartoon


                      • Kirk Job-Sluder

                        #26
                        Re: perl to python

                        On 2004-05-12, Carl Banks <imbosol@aerojockey.invalid> wrote:
                        > Kirk Job-Sluder wrote:
                        >> Write me a two-line script in python that reads a character delimited
                        >> file, and printf pretty-prints all of the records in a different order.
                        >
                        > How about one line (broken into three for clarity):
                        >
                        > for line in __import__('sys').stdin:
                        >     print ''.join([ x.rjust(10) for x in map(
                        >         line.strip().split(',').__getitem__,[4,3,2,1,0]) ])
                        >
                        > Believe it or not, I actually do stuff like this on the command line
                        > once in a while; to me, it's less effort to type this in than to
                        > remember (read: look up) the details of awk syntax. I don't think I'm
                        > typical in this regard, though.

                        This looks like using the proverbial hammer to drive the screw.

                        I still find:
                        awk 'BEGIN {FS="\t"} {printf("pattern", $1,$4,$3,$2)}' file

                        to be more elegant and easier to debug. It does the required task in
                        two easy-to-remember statements.



                        • Ville Vainio

                          #27
                          Re: perl to python

                          >>>>> "Kirk" == Kirk Job-Sluder <kirk@eyegor.jobsluder.net> writes:

                          Kirk> I've not found this to be the case due to Python's emphasis
                          Kirk> on being explicit rather than implicit. My emulation of
                          Kirk> "perl -pi -e" was about 24 lines in length. Even with the
                          Kirk> improvement there are still 10 times as many statements where
                          Kirk> things can go wrong.

                          That's when you create a module which does the implicit looping. Or a
                          python script that evals the passed expression string in the loop.
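A minimal sketch of such a script, assuming Python 3. The names `apply_expression`, `run`, `line` and `F` are hypothetical, chosen by analogy with `perl -n` and awk; nothing here is taken from an existing module.

```python
import sys


def apply_expression(expression, lines):
    """Evaluate `expression` once per input line and yield each result as text.

    Inside the expression, `line` is the current line without its newline
    and `F` is the list of whitespace-separated fields.
    """
    code = compile(expression, "<expression>", "eval")
    for raw in lines:
        line = raw.rstrip("\n")
        # eval() runs arbitrary code, so only pass expressions you trust.
        # The names go in as globals so nested generator expressions see them.
        yield str(eval(code, {"line": line, "F": line.split()}))


def run(expression, stdin=None, stdout=None):
    """Apply `expression` to every line of `stdin`, writing results to `stdout`."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    for result in apply_expression(expression, stdin):
        stdout.write(result + "\n")
```

A wrapper script could pass `sys.argv[1]` to `run`, which gives the implicit loop while keeping the per-line logic on the command line.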

                          Kirk> It is really hard to be more trivial than a complete program in one
                          Kirk> command line.

                          As has been stated elsewhere, you can do the trick on the command
                          line. The effort to create the required tools only needs to be paid
                          once.

                          However, many times it won't matter whether the whole program fits on
                          the command line. I always do a script into a file and then execute
                          it. I just prefer a real editor to command history editing if
                          something goes wrong.

                          Kirk> It's a matter of task efficiency. Why should I spend a half
                          Kirk> hour doing in python something that takes 1 minute if you
                          Kirk> know the right sed, awk or perl one-liner? There is a level
                          Kirk> of complexity where you are better off using python. But
                          Kirk> why not use a one-liner when it is available?

                          I think one should just analyze the need, implement the requisite
                          module(s) and the script to invoke the stuff in modules. The needs
                          have the habit of repeating themselves, and having a bit more
                          structure in the solution will pay off.

                          Kirk> Write me a two-line script in python that reads a character
                          Kirk> delimited file, and printf pretty-prints all of the records
                          Kirk> in a different order.

                          (Already done)

                          Kirk> Sometimes, a utility that uses an implicit loop over every line of a
                          Kirk> file is useful. That's not emotional attachment, it's plain common
                          Kirk> sense.

                          The virtue of the implicitness is still arguable.

                          --
                          Ville Vainio http://tinyurl.com/2prnb


                          • Ville Vainio

                            #28
                            Daily Python URL alert (was Re: perl to python)

                            >>>>> "Pete" == Pete Forman <pete.forman@westerngeco.com> writes:

                            Pete> For a less obfuscated approach, look at PyOne to run short python
                            Pete> scripts from a one-line command.

                            Pete> http://www.unixuser.org/~euske/pyone/

                            Looks exactly like something I always wanted to implement, but found
                            that doing the script in a multi-line file is easier. It's great that
                            someone has got around to implementing something like this.

                            There should be a wiki entry for "quick and dirty python" (sounds
                            somehow... suspicious ;-), having awk/sed/oneliner workalikes.

                            --
                            Ville Vainio http://tinyurl.com/2prnb


                            • Kirk Job-Sluder

                              #29
                              Re: perl to python

                              On 2004-05-12, Ville Vainio <ville@spammers.com> wrote:
                              >>>>>> "Kirk" == Kirk Job-Sluder <kirk@eyegor.jobsluder.net> writes:
                              >
                              > Kirk> I've not found this to be the case due to Python's emphasis
                              > Kirk> on being explicit rather than implicit. My emulation of
                              > Kirk> "perl -pi -e" was about 24 lines in length. Even with the
                              > Kirk> improvement there are still 10 times as many statements where
                              > Kirk> things can go wrong.
                              >
                              > That's when you create a module which does the implicit looping. Or a
                              > python script that evals the passed expression string in the loop.

                              Except now you've just eliminated portability, one of the main arguments
                              for using python in the first place.

                              And here is the fundamental question. Why should I spend my time
                              writing a module in python to emulate another tool, when I can simply
                              use that other tool? Why should I, as a researcher who must process
                              large quantities of data, spend my time and my employer's money
                              reinventing the wheel?

                              > Kirk> It is really hard to be more trivial than a complete program in one
                              > Kirk> command line.
                              >
                              > As has been stated elsewhere, you can do the trick on the command
                              > line. The effort to create the required tools only needs to be paid
                              > once.

                              One can do the trick on one command line in python. However, that
                              command line is an ugly, inelegant hack that eliminates the most
                              important advantage of python: clear, easy to understand code. In
                              addition, that example still required 8 python statements compared to
                              two in awk.

                              > However, many times it won't matter whether the whole program fits on
                              > the command line. I always do a script into a file and then execute
                              > it. I just prefer a real editor to command history editing if
                              > something goes wrong.

                              Which is what I do as well. The question is, why should I write 8
                              python statements to perform a task that I can do in two using awk or
                              sed? Why should I spend 30 minutes writing, testing and debugging a
                              python script that takes 5 minutes to write in awk or sed, taking
                              advantage of the implicit loops and record splitting?

                              > I think one should just analyze the need, implement the requisite
                              > module(s) and the script to invoke the stuff in modules. The needs
                              > have the habit of repeating themselves, and having a bit more
                              > structure in the solution will pay off.

                              I think you are missing a key step. You are starting off with a
                              solution (python scripts and modules) and letting it drive your
                              needs analysis. I don't get paid enough money to write pythonic
                              solutions to problems that have already been fixed using other tools.

                              > The virtue of the implicitness is still arguable.

                              I'll be more specific about the challenge. Using only stock python with
                              no added modules, give me a script that pretty-prints a
                              character-delimited file using one variable assignment and one function.

                              Here is the solution in awk:
                              BEGIN { FS="\t" }
                              {printf("%s %s %s %s", $4, $3, $2, $1)}



                              • Roy Smith

                                #30
                                Re: perl to python

                                Kirk Job-Sluder <kirk@eyegor.jobsluder.net> wrote:
                                > And here is the fundamental question. Why should I spend my time
                                > writing a module in python to emulate another tool, when I can simply
                                > use that other tool? Why should I, as a researcher who must process
                                > large quantities of data, spend my time and my employer's money
                                > reinventing the wheel?

                                At the risk of veering this thread in yet another different direction,
                                anybody who does analysis of large amounts of data should take a look at
                                Gary Perlman's excellent, free, and generally under-appreciated |STAT
                                package.



                                It's been around in one version or another for something like 20 years.
                                It fills an interesting little niche that's part data manipulation and
                                part statistics.

                                > Here is the solution in awk:
                                > BEGIN { FS="\t" }
                                > {printf("%s %s %s %s", $4, $3, $2, $1)}

                                In |STAT, that would be simply "colex 4 3 2 1".

                                There's nothing you can do in |STAT that you couldn't do with more
                                general purpose tools like awk, perl, python, etc, but |STAT often has a
                                quicker, simpler, easier way to do many common statistical tasks. A
                                good tool to have in your toolbox.

                                For example, one of the cool tools is "validata". You feed it a file
                                and it applies some heuristics trying to guess which data in it might be
                                invalid. For example, if a file looks like it's columns of numbers, and
                                the third column is all integers except for one entry which is a
                                floating point number, it'll guess that might be an error and flag it.
                                It's great when you're analyzing 5000 log files of 100,000 lines each
                                and one of them makes your script crash for no apparent reason.
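A rough sketch of the kind of heuristic described above, in Python 3. This is not |STAT's actual algorithm, which the post does not spell out; the names `classify` and `flag_outliers` and the "most common type per column" rule are assumptions for illustration.

```python
from collections import Counter


def classify(cell):
    """Label a cell as 'int', 'float', or 'text'."""
    for label, convert in (("int", int), ("float", float)):
        try:
            convert(cell)
            return label
        except ValueError:
            pass
    return "text"


def flag_outliers(rows):
    """Return (row_index, column_index, cell) for every cell whose type
    differs from the most common type in its column."""
    flagged = []
    width = max((len(row) for row in rows), default=0)
    for col in range(width):
        # Skip rows that are too short to have this column.
        cells = [(r, row[col]) for r, row in enumerate(rows) if col < len(row)]
        kinds = Counter(classify(cell) for _, cell in cells)
        majority = kinds.most_common(1)[0][0]
        flagged.extend((r, col, cell) for r, cell in cells
                       if classify(cell) != majority)
    return flagged


rows = [["1", "2", "3"], ["4", "5", "6.5"], ["7", "8", "9"]]
print(flag_outliers(rows))  # → [(1, 2, '6.5')]
```

Here the third column is all integers except for one floating point entry, so that entry is the only one flagged.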
