Can I beat perl at grep-like processing speed?

  • js

    Can I beat perl at grep-like processing speed?

    Just curious.
    Can Python beat Perl at grep-like processing speed?


    $ wget http://www.gutenberg.org/files/7999/7999-h.zip
    $ unzip 7999-h.zip
    $ cd 7999-h
    $ cat *.htm > bigfile
    $ du -h bigfile
    8.2M bigfile

    ---------- grep.pl ----------
    #!/usr/local/bin/perl
    open(F, 'bigfile') or die;

    while(<F>) {
        s/[\n\r]+$//;
        print "$_\n" if m/destroy/oi;
    }
    ---------- END ----------
    ---------- grep.py ----------
    #!/usr/bin/env python
    import re
    r = re.compile(r'destroy', re.IGNORECASE)

    for s in file('bigfile'):
        if r.search(s): print s.rstrip("\r\n")
    ---------- END ----------

    $ time perl grep.pl > pl.out; time python grep.py > py.out
    real 0m0.168s
    user 0m0.149s
    sys 0m0.015s

    real 0m0.450s
    user 0m0.374s
    sys 0m0.068s
    # I used Python 2.5 and Perl 5.8.6
  • Christophe Cavalaria

    #2
    Re: Can I beat perl at grep-like processing speed?

    js wrote:
    > Just curious.
    > Can Python beat Perl at grep-like processing speed?

    I'm thankful for the Python version; otherwise I'd never have guessed what
    that code was supposed to do!

    Try this:
    ---------- grep.py ----------
    #!/usr/bin/env python
    import re
    def main():
        # Bind the compiled pattern's search method to a local name once.
        search = re.compile(r'destroy', re.IGNORECASE).search

        for s in file('bigfile'):
            if search(s): print s.rstrip("\r\n")

    main()
    ---------- END ----------
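
The version above makes two changes: the loop runs inside a function, so the loop variables are locals rather than globals, and the `search` method is looked up once instead of on every line. Below is a minimal sketch of the same idea, written for Python 3 rather than the 2.5 used in this thread, and run against a made-up in-memory sample instead of `bigfile`. The names `SAMPLE`, `grep_attribute_lookup` and `grep_local_binding` are illustrative only.

```python
import re
import timeit

# Hypothetical in-memory stand-in for the lines of 'bigfile'.
SAMPLE = ["nothing to see here\r\n", "They will DESTROY it\r\n"] * 5000

PATTERN = re.compile(r"destroy", re.IGNORECASE)


def grep_attribute_lookup(lines):
    """Look up the global PATTERN and its .search attribute on every line."""
    out = []
    for s in lines:
        if PATTERN.search(s):
            out.append(s.rstrip("\r\n"))
    return out


def grep_local_binding(lines):
    """Bind the bound method to a local name once, then call it per line."""
    search = PATTERN.search
    out = []
    for s in lines:
        if search(s):
            out.append(s.rstrip("\r\n"))
    return out


if __name__ == "__main__":
    # Both variants must select exactly the same lines.
    assert grep_attribute_lookup(SAMPLE) == grep_local_binding(SAMPLE)
    for fn in (grep_attribute_lookup, grep_local_binding):
        elapsed = timeit.timeit(lambda: fn(SAMPLE), number=20)
        print("%-24s %.3fs" % (fn.__name__, elapsed))
```

How much the local binding helps depends on the interpreter version, so it is worth measuring on your own machine rather than assuming a fixed speedup.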


    • Nick Craig-Wood

      #3
      Re: Can I beat perl at grep-like processing speed?

      js <ebgssth@gmail.com> wrote:
      > Just curious.
      > Can Python beat Perl at grep-like processing speed?

      Playing for the other side temporarily, this is nearly twice as fast...

      $ time perl -lne 'print if m/destroy/oi' bigfile >pl.out
      real 0m0.133s
      user 0m0.120s
      sys 0m0.012s

      vs

      $ time ./z.pl >pl.out.orig
      real 0m0.223s
      user 0m0.208s
      sys 0m0.016s

      Which gives the same output modulo a few \r

      --
      Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick
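
The stray `\r` difference comes from how each script strips line endings: `perl -l` chomps only the trailing newline, while the original script's `s/[\n\r]+$//` removes any trailing run of `\n` and `\r`. The same distinction in Python, as a small illustration (Python 3; the sample line is made up):

```python
# A made-up line with a DOS-style line ending.
line = "They will destroy it\r\n"

# Rough equivalent of `perl -l`: only the final newline is removed,
# so the carriage return survives.
chomped = line[:-1] if line.endswith("\n") else line

# Rough equivalent of s/[\n\r]+$//: every trailing \r and \n is removed.
stripped = line.rstrip("\r\n")

print(repr(chomped))
print(repr(stripped))
```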


      • Bruno Desthuilliers

        #4
        Re: Can I beat perl at grep-like processing speed?

        js wrote:
        > Just curious.
        > Can Python beat Perl at grep-like processing speed?

        Probably not.

        Please note that you're also benchmarking I/O here, and Perl seems to
        use a custom, highly optimized I/O library that is much faster than the
        system's. I once made a quick-and-dirty cat-like comparison of Perl,
        Python and C on my Gentoo Linux box, and the Perl version was far
        faster than the C one.

        Now the real question is, IMHO: is the Python version fast enough?

        My 2 cents.
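
One way to see how much of the runtime is I/O rather than regex matching is to time a pass that only reads the lines against a pass that also searches them. This is a rough sketch, assuming Python 3 and a throwaway temporary file standing in for `bigfile`. `time_pass` and `make_sample` are hypothetical helpers, not something from this thread.

```python
import os
import re
import tempfile
import time


def time_pass(path, predicate=None):
    """Read every line of `path`; optionally apply `predicate` to each line.

    Returns (elapsed_seconds, number_of_lines_accepted).
    """
    start = time.perf_counter()
    hits = 0
    # newline="" keeps the original line endings untranslated.
    with open(path, "r", newline="") as f:
        for line in f:
            if predicate is None or predicate(line):
                hits += 1
    return time.perf_counter() - start, hits


def make_sample(n_pairs=50000):
    """Write a throwaway file standing in for 'bigfile' and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", newline="") as f:
        for _ in range(n_pairs):
            f.write("nothing to see here\r\n")
            f.write("They will DESTROY it\r\n")
    return path


if __name__ == "__main__":
    path = make_sample()
    try:
        search = re.compile(r"destroy", re.IGNORECASE).search
        io_only, total_lines = time_pass(path)
        io_and_re, matches = time_pass(path, search)
        print("read only:     %.3fs (%d lines)" % (io_only, total_lines))
        print("read + search: %.3fs (%d matches)" % (io_and_re, matches))
    finally:
        os.remove(path)
```

If the two timings are close, most of the cost is in reading and iterating over lines, and tuning the regex will not change much.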


        • Fredrik Lundh

          #5
          Re: Can I beat perl at grep-like processing speed?

          Nick Craig-Wood wrote:
          > #!/usr/bin/env python
          > import re
          > r = re.compile(r'destroy', re.IGNORECASE)
          >
          > for s in file('bigfile'):
          >     if r.search(s): print s.rstrip("\r\n")

          footnote: if you're searching for literal strings with Python 2.5, using "in" is a
          lot faster than using re.search.

          </F>
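
For the literal pattern in this thread, the substring test could look like the sketch below. Because the original regex uses `re.IGNORECASE`, each line is lowercased before the `in` test. That is an assumption about how to keep the two equivalent: it holds for ASCII text, but some non-ASCII characters case-fold differently. The code is written for Python 3, and `SAMPLE`, `grep_re` and `grep_in` are made-up names.

```python
import re
import timeit

# Hypothetical in-memory stand-in for the lines of 'bigfile'.
SAMPLE = ["nothing to see here\r\n", "They will DESTROY it\r\n"] * 5000

NEEDLE = "destroy"
search = re.compile(re.escape(NEEDLE), re.IGNORECASE).search


def grep_re(lines):
    """Select lines with a case-insensitive regex search."""
    return [s.rstrip("\r\n") for s in lines if search(s)]


def grep_in(lines):
    """Select lines with a plain substring test on the lowercased line."""
    return [s.rstrip("\r\n") for s in lines if NEEDLE in s.lower()]


if __name__ == "__main__":
    # Both variants must select exactly the same lines on this sample.
    assert grep_re(SAMPLE) == grep_in(SAMPLE)
    for fn in (grep_re, grep_in):
        elapsed = timeit.timeit(lambda: fn(SAMPLE), number=20)
        print("%-8s %.3fs" % (fn.__name__, elapsed))
```

The footnote above is about Python 2.5; whether the gap is as large on a newer interpreter is something to measure rather than assume.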


