Confusing performance results for prime

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Greg Brunet

    Confusing performance results for prime

    In doing some testing of different but simple algorithms for getting a
    list of prime numbers, I ended up getting some results that seem a bit
    contradictory. Given the following test program (testPrimes.py) with
    two algorithms that both check for primes by testing only odd numbers
    using factors up to the square root of the value, where Primes1 is based
    on all of the existing primes so far, and Primes2 is based on all odd
    numbers, I would expect that, as the number of primes we check gets
    larger, that Primes1 should get more & more efficient (since it should
    be performing less tests for each number). However the ratio seems to
    be relatively consistent. Another thing that is interesting is that
    once I tested up to n=200000, I got an exception in Primes2. I added
    the long typecast in the 'limit=' statement for both versions and the
    resulting improvements were strange: Primes1 runs 10-20% SLOWER, and
    Primes2 runs about 50% FASTER!

    I am not looking to come up with the most optimized solution to prime
    numbers here. Primes1 is faster than Primes2 as I would expect and it
    is clear, relatively elegant, and sufficient for me. However, I can't
    figure out why the ratio doesn't improve with size, and the strange
    results of the long typecast, and I would like to be able to understand
    what is causing their behavior. Any ideas? Thanks,

    --
    Greg


    #------------------------------------------------------------
    import math, time

    def Primes1(y):
    primes = [3]
    for p in range(5, y, 2):
    limit = math.sqrt(p) # or: long(math.sqrt( p))
    for factor in primes:
    if factor > limit: # then number is prime
    primes.append(p )
    break
    elif p % factor == 0: # then it's *not* prime
    break
    return [2] + primes

    def Primes2(y):
    primes = [2,3]
    for p in range(5, y, 2):
    limit = math.sqrt(p) + 1 # or: long(math.sqrt( p)) + 1
    # check one extra to let us see if there were no factors
    for n in range(3,limit+3 ,2):
    if p % n == 0: # then it's *not* prime
    break
    if n > limit:
    primes.append(p ) # then number is prime
    return primes


    if __name__ == "__main__":
    n=100000
    tb = time.time()
    l1 = Primes1(n)
    t1 = time.time() - tb
    print 'Primes1: %s primes up to %s in %5.3f secs' % (len(l1), n, t1)
    tb = time.time()
    l2 = Primes2(n)
    t2 = time.time() - tb
    print 'Primes2: %s primes up to %s in %5.3f secs' % (len(l2), n, t2)
    print 'ratio: %5.2f' % tt

    #------------------------------------------------------------

    I get the following results (avg for 3 runs each using Python 2.3.2 on
    Win XP Pro):

    n=50000 n=100000 n=200000
    w/out long:
    -----------
    Primes1: 0.41 0.95 2.24
    Primes2: 1.90 4.21 9.28
    Ratio: 4.63 4.42 4.14

    w/ long:
    -----------
    Primes1: 0.48 1.12 2.60
    Primes2: 0.84 2.04 4.75
    Ratio: 1.75 1.82 1.82



  • Bengt Richter

    #2
    Re: Confusing performance results for prime

    On Fri, 17 Oct 2003 15:33:24 -0500, "Greg Brunet" <gregbrunet@NOS PAMsempersoft.c om> wrote:
    [color=blue]
    >In doing some testing of different but simple algorithms for getting a
    >list of prime numbers, I ended up getting some results that seem a bit
    >contradictor y. Given the following test program (testPrimes.py) with
    >two algorithms that both check for primes by testing only odd numbers
    >using factors up to the square root of the value, where Primes1 is based
    >on all of the existing primes so far, and Primes2 is based on all odd
    >numbers, I would expect that, as the number of primes we check gets
    >larger, that Primes1 should get more & more efficient (since it should
    >be performing less tests for each number). However the ratio seems to
    >be relatively consistent. Another thing that is interesting is that
    >once I tested up to n=200000, I got an exception in Primes2. I added
    >the long typecast in the 'limit=' statement for both versions and the
    >resulting improvements were strange: Primes1 runs 10-20% SLOWER, and
    >Primes2 runs about 50% FASTER!
    >
    >I am not looking to come up with the most optimized solution to prime
    >numbers here. Primes1 is faster than Primes2 as I would expect and it
    >is clear, relatively elegant, and sufficient for me. However, I can't
    >figure out why the ratio doesn't improve with size, and the strange
    >results of the long typecast, and I would like to be able to understand
    >what is causing their behavior. Any ideas? Thanks,[/color]

    Without looking at the implementation code, I could only guess, but you
    are using range instead of xrange (but maybe range is now optimized not to build
    and actual list in iterator context?). Also, limit as a float or long being passed to
    range must cause a little hiccup. Perhaps an iter optimization is being interfered with?
    If range is not optimized, then you must be taking a hit from building throwaway
    lists with range instead of using xrange, so your inner loop in Primes2 would have
    a bunch of overhead. Also, you don't need floating point or long for quite some time,
    and x*x > y will be faster than x > math.sqrt(y) and avoid the float altogether
    and long to boot, unless you are generating primes larger than maxint.

    If you want to conserve space for an all-int primes list, you can use
    primes = array.array('l' ,[2]) instead of primes = [2] to start. Just doing that
    apparently gives about the same speed, but I imagine pre-allocating array space
    could make it faster, though you'd have to change the append and iteration logic.
    Hard to say without testing.
    [color=blue]
    >
    >--
    >Greg
    >[/color]
    Sorry, I can't resist suggesting a slight speed improvement ;-)

    def Primes3(prime_r ange_top):
    primes = [2]
    for prime_candidate in xrange(3,prime_ range_top,2):
    for p in primes:
    if prime_candidate %p==0: break
    if p*p>prime_candi date:
    primes.append(p rime_candidate)
    break
    else:
    primes.append(p rime_candidate)
    return primes

    And maybe running the bunch with

    if __name__ == "__main__":
    n=100000
    times = []
    for pfun in (Primes1, Primes2, Primes3):
    tb = time.time()
    l1 = pfun(n)
    dt = time.time() - tb
    times.append(dt )
    print '%s: %s primes up to %s in %5.3f secs' % (
    pfun.__name__, len(l1), n, dt)
    fastest = min(times)
    print '\nTimes relative to the fastest:'
    for i, pfun in enumerate((Prim es1, Primes2, Primes3)):
    print '%s: %5.3f'%(pfun.__ name__, times[i]/fastest)

    Which gave me (on older machine ;-)

    Primes1: 9592 primes up to 100000 in 2.984 secs
    Primes2: 9592 primes up to 100000 in 5.608 secs
    Primes3: 9592 primes up to 100000 in 2.444 secs

    Times relative to the fastest:
    Primes1: 1.221
    Primes2: 2.295
    Primes3: 1.000

    Regards,
    Bengt Richter

    Comment

    • Georgy Pruss

      #3
      Re: Confusing performance results for prime

      I did the tests for the original Primes1 & 2 (only changed
      range --> xrange and sqrt --> int(sqrt)) and your Primes3

      Primes1: 9592 primes up to 100000 in 0.547 secs
      Primes2: 9592 primes up to 100000 in 0.813 secs (+48.6%)
      Primes3: 9592 primes up to 100000 in 0.578 secs (+5.7%)

      Primes1: 49098 primes up to 600000 in 5.047 secs
      Primes2: 49098 primes up to 600000 in 8.844 secs (+75.2%)
      Primes3: 49098 primes up to 600000 in 5.922 secs (+17.3%)

      Primes1: 107126 primes up to 1400000 in 14.469 secs
      Primes2: 107126 primes up to 1400000 in 27.453 secs (+89.7%)
      Primes3: 107126 primes up to 1400000 in 17.203 secs (+18.9%)

      Georgy
      E^mail: 'ZDAwMTEyMHQwMz MwQGhvdG1haWwuY 29t\n'.decode(' base64')


      "Bengt Richter" <bokr@oz.net> wrote in message news:bmqfp9$1fu $0@216.39.172.1 22...[color=blue]
      > <...>
      > Sorry, I can't resist suggesting a slight speed improvement ;-)
      >
      > def Primes3(prime_r ange_top):
      > primes = [2]
      > for prime_candidate in xrange(3,prime_ range_top,2):
      > for p in primes:
      > if prime_candidate %p==0: break
      > if p*p>prime_candi date:
      > primes.append(p rime_candidate)
      > break
      > else:
      > primes.append(p rime_candidate)
      > return primes
      >
      > And maybe running the bunch with
      >
      > if __name__ == "__main__":
      > n=100000
      > times = []
      > for pfun in (Primes1, Primes2, Primes3):
      > tb = time.time()
      > l1 = pfun(n)
      > dt = time.time() - tb
      > times.append(dt )
      > print '%s: %s primes up to %s in %5.3f secs' % (
      > pfun.__name__, len(l1), n, dt)
      > fastest = min(times)
      > print '\nTimes relative to the fastest:'
      > for i, pfun in enumerate((Prim es1, Primes2, Primes3)):
      > print '%s: %5.3f'%(pfun.__ name__, times[i]/fastest)
      >
      > Which gave me (on older machine ;-)
      >
      > Primes1: 9592 primes up to 100000 in 2.984 secs
      > Primes2: 9592 primes up to 100000 in 5.608 secs
      > Primes3: 9592 primes up to 100000 in 2.444 secs
      >
      > Times relative to the fastest:
      > Primes1: 1.221
      > Primes2: 2.295
      > Primes3: 1.000
      >
      > Regards,
      > Bengt Richter[/color]


      Comment

      • Lulu of the Lotus-Eaters

        #4
        Re: Confusing performance results for prime

        bokr@oz.net (Bengt Richter) wrote previously:
        |In doing some testing of different but simple algorithms for getting a
        |list of prime numbers

        This general topic came up quite recently. I myself wrote a bunch about
        various data structures to cache primes and the like. But that was all
        fairly idle.

        If you want to test primality quickly, using the sieve of erotosthenes
        is pretty terrible. At least if you only want to check relatively few
        numbers, rather than assemble all the primes. Orders of magnitude
        better is to use Miller-Rabin pseudoprimality tests (well, complexity
        orders really).

        Pick the right algorithm before worrying about micro-optimizations.

        Yours, Lulu...

        Comment

        • Greg Brunet

          #5
          Re: Confusing performance results for prime

          "Bengt Richter" <bokr@oz.net> wrote in message
          news:bmqfp9$1fu $0@216.39.172.1 22...[color=blue]
          > On Fri, 17 Oct 2003 15:33:24 -0500, "Greg Brunet"[/color]
          <gregbrunet@NOS PAMsempersoft.c om> wrote:[color=blue]
          >
          > Without looking at the implementation code, I could only guess, but[/color]
          you[color=blue]
          > are using range instead of xrange (but maybe range is now optimized[/color]
          not to build[color=blue]
          > and actual list in iterator context?). Also, limit as a float or long[/color]
          being passed to[color=blue]
          > range must cause a little hiccup. Perhaps an iter optimization is[/color]
          being interfered with?[color=blue]
          > If range is not optimized, then you must be taking a hit from building[/color]
          throwaway[color=blue]
          > lists with range instead of using xrange, so your inner loop in[/color]
          Primes2 would have[color=blue]
          > a bunch of overhead. Also, you don't need floating point or long for[/color]
          quite some time,[color=blue]
          > and x*x > y will be faster than x > math.sqrt(y) and avoid the float[/color]
          altogether[color=blue]
          > and long to boot, unless you are generating primes larger than maxint.
          >
          > If you want to conserve space for an all-int primes list, you can use
          > primes = array.array('l' ,[2]) instead of primes = [2] to start. Just[/color]
          doing that[color=blue]
          > apparently gives about the same speed, but I imagine pre-allocating[/color]
          array space[color=blue]
          > could make it faster, though you'd have to change the append and[/color]
          iteration logic.[color=blue]
          > Hard to say without testing.
          >[/color]

          Bengt:

          I can understand that changing from range to xrange might help things
          slightly, and there are other areas as for improvement as well. It's
          the 2 areas that I mentioned that resulted in some unexpected results
          for me that I still didn't understand. I've taken your improvements on
          running & comparing multiple tests (neat stuff BTW) and found the
          following:

          - Changing from range to xrange doesn't make an appreciable
          difference (it flip-flops back & forth over multiple runs). I believe I
          had noticed this before, I've now got that actually built into the test
          now.

          - Getting rid of the Sqrt requires you to perform the (x*x>y) type
          test into the loop (whereas performing the sqrt operation allows you to
          move that calculation outside of the loop and do a simpler
          (factor>limit) test inside the loop). Even though I realized that this
          means I need to import math (at least the sqrt function), I figured that
          this still be better (which I think is Georgy's point). As I increase
          the upperlimit, the sqrt version gets better & better compared to the
          x*x version (as one would expect - trading off a more 'expensive'
          operation in order to move it outside of the loop). I did tests at
          100k, 200k, & 300k, and by 300k, the sqrt test was about 20% faster than
          the x*x test.

          I've noted some other things that I've tried and put an updated test
          program in a response to my original message. Thanks for your help -
          it's helped me understand where the improvements were coming from and
          that's what my main goal was all about.

          --
          Greg

          Comment

          • Greg Brunet

            #6
            Re: Confusing performance results for prime

            Thanks for the responses! After some more testing based on the
            responses I got, here's some additional notes:

            - xrange doesn't make an appreciable difference for this situation.

            - Comparing 'equivalent' Primes1 & Primes3 versions shows that the sqrt
            (outside the loop) is a worthwhile optimization (especially for larger
            tests) (confirming Georgy's post).

            - The reason I seed primes with 3 instead of 2 is that I know that I
            won't ever have a match on it (since my loop skips even numbers). This
            lets me skip an extra test each time thru the loop. If I use that in
            Bengt's x*x version, it cuts the disadvantage of that loop (compared to
            the sqrt one) about in half.

            - The really big improvement gain in Primes2 by performing the typecase
            (to either int or long) is most likely due to avoiding throwing an
            exception. I don't always see it (don't understand why it's not
            consistent), but sometimes I get a: "DeprecationWar ning: integer
            argument expected, got float" in the 'for' statement in Primes2.

            - I should have typecast to [int] instead of [long]. If I had done so,
            the results would have matched my expectations.

            - Once I got both Primes1 & 2 using comparable algorithms (see Primes1i
            & Primes2i below), Primes1 does become increasingly more efficient than
            Primes2 with longer lists, as one would expect, since it will perform
            relatively fewer loops.

            For anyone interested, here is my updated testing program. It tests
            these versions of Primes1: original, int typecast, int typecast w/
            xrange, int typecast w/ reversed if tests, long typecast. If you run it,
            you'll note that I skip testing the original Primes2, since the warning
            exception makes it run MUCH longer (like 400% times longer than
            Primes1i). It also has Bengt's Primes3 (and my change of it to skip
            [2], then add it back in: Primes3a). Finally, it has another
            interesting algorithm (Primes4) that I saw (in cookbook? - don't
            remember anymore). I also don't run it in the comparative tests because
            it is VERY slow (like 150x slower)! Be sure to run it long enough that
            the measurement intervals are meaningful. I default the test to 10000
            for older machines, but I'd recommend using a value large enough that
            the fastest test takes at least 1 second

            --
            Greg


            #------------------------------------------------------------
            import math, time, sys
            from math import sqrt
            #import psyco
            #psyco.full()

            # Greg Brunet, Oct 18, 2003

            # A check for a number being prime would generally see if
            # it is odd and not divisible by any odd numbers up to it's
            # square root. However, if we're generating a list of primes,
            # we'll use that list to reduce the factors that we check
            # against (Primes1 vs. Primes2).

            #initial test using list of primes
            def Primes1(y):
            primes = [3]
            for p in range(5, y, 2):
            limit = sqrt(p)
            for factor in primes:
            if factor > limit: # then number is prime
            primes.append(p )
            break
            elif p % factor == 0: # then it's *not* prime
            break

            #primes = [item for item in [2] + primes if x <= item < y]
            return [2] + primes

            #test using list of primes - w/ int typecast
            # (so camparison is with 2 ints)
            def Primes1i(y):
            primes = [3]
            for p in range(5, y, 2):
            limit = int(sqrt(p))
            for factor in primes:
            if factor > limit: # then number is prime
            primes.append(p )
            break
            elif p % factor == 0: # then it's *not* prime
            break

            #primes = [item for item in [2] + primes if x <= item < y]
            return [2] + primes

            #test using list of primes - w/ int typecast and xrange
            # (to see if xrange is faster - no noticable difference)
            def Primes1ix(y):
            primes = [3]
            for p in xrange(5, y, 2):
            limit = int(sqrt(p))
            for factor in primes:
            if factor > limit: # then number is prime
            primes.append(p )
            break
            elif p % factor == 0: # then it's *not* prime
            break

            #primes = [item for item in [2] + primes if x <= item < y]
            return [2] + primes

            #test using list of primes - w/ int typecast and reversing
            # internal if tests (to see if we'll execute less if's that
            # way (run faster) - no noticable difference)
            def Primes1ir(y):
            primes = [3]
            for p in range(5, y, 2):
            limit = int(sqrt(p))
            for factor in primes:
            if p % factor == 0: # then it's *not* prime
            break
            elif factor > limit: # then number is prime
            primes.append(p )
            break

            #primes = [item for item in [2] + primes if x <= item < y]
            return [2] + primes

            #test using list of primes - w/ long typecast
            # (initial optimization, before realizing that I should be
            # using int instead!)
            def Primes1l(y):
            primes = [3]
            for p in range(5, y, 2):
            limit = long(sqrt(p))
            for factor in primes:
            if factor > limit: # then number is prime
            primes.append(p )
            break
            elif p % factor == 0: # then it's *not* prime
            break

            #primes = [item for item in [2] + primes if x <= item < y]
            return [2] + primes


            #test using all odd numbers up to limit
            # (but having a float in 'for..range(... ' causes exception -> SLOW!)
            def Primes2(y):
            primes = [2,3]
            for p in range(5, y, 2):
            limit = sqrt(p) + 1
            # check one extra (limit+3) to let us use (n>limit) below to
            # know that we didn't break out of the loop
            for n in range(3,limit+3 ,2):
            if p % n == 0: # then it's *not* prime
            break
            if n > limit:
            primes.append(p ) # then number is prime

            return primes

            #test using all odd numbers up to limit - with int typecast
            # (gets rid of exception)
            def Primes2i(y):
            primes = [2,3]
            for p in range(5, y, 2):
            limit = int(sqrt(p)) + 1
            # check one extra (limit+3) to let us use (n>limit) below to
            # know that we didn't break out of the loop
            for n in range(3,limit+3 ,2):
            if p % n == 0: # then it's *not* prime
            break
            if n > limit:
            primes.append(p ) # then number is prime

            return primes

            #test using all odd numbers up to limit - with int typecast
            # (gets rid of exception, but not as fast as int!)
            def Primes2l(y):
            primes = [2,3]
            for p in range(5, y, 2):
            limit = long(sqrt(p)) + 1
            # check one extra (limit+3) to let us use (n>limit) below to
            # know that we didn't break out of the loop
            for n in range(3,limit+3 ,2):
            if p % n == 0: # then it's *not* prime
            break
            if n > limit:
            primes.append(p ) # then number is prime

            return primes

            #Bengt Richter's test avoiding sqrt call
            def Primes3(prime_r ange_top):
            primes = [2]
            for prime_candidate in xrange(3,prime_ range_top,2):
            for p in primes:
            if prime_candidate %p==0: break
            if p*p>prime_candi date:
            primes.append(p rime_candidate)
            break
            else:
            primes.append(p rime_candidate)
            return primes

            #Bengt Richter's test
            # - changed to skip checking for even (always false)
            def Primes3a(prime_ range_top):
            primes = [3]
            for prime_candidate in xrange(3,prime_ range_top,2):
            for p in primes:
            if prime_candidate %p==0: break
            if p*p>prime_candi date:
            primes.append(p rime_candidate)
            break
            else:
            primes.append(p rime_candidate)
            return [2] + primes

            #interesting approach, but VERY slow
            def Primes4(y):
            noprimes = [j for i in range(2, int(sqrt(y))+1) for j in range(i*2,
            y, i)]
            return [x for x in range(2, y) if x not in noprimes]

            if __name__ == "__main__":
            n=100000
            if len(sys.argv)>1 :
            n=int(sys.argv[1])

            times = []
            primes = []
            for pfun in (Primes1, Primes1i, Primes1ix, Primes1ir,
            Primes1l, Primes2i, Primes2l, Primes3, Primes3a):
            tb = time.time()
            p = pfun(n)
            primes.append(l en(p))
            dt = time.time() - tb
            times.append(dt )
            fastest = min(times)

            print 'Prime number algorithms/optimizations - up to:', n
            if (fastest==0):
            print '*** Percentages are invalid (fastest time is 0) ***'
            fastest = .01
            for i, pfun in enumerate((Prim es1, Primes1i, Primes1ix,
            Primes1ir, Primes1l, Primes2i,
            Primes2l, Primes3, Primes3a)):
            print '%s: %s primes up to %s in %5.3f secs (+%3.1f%%)' % (
            pfun.__name__, primes[i], n, times[i],
            (times[i]-fastest)*100/fastest)

            #------------------------------------------------------------

            Comment

            • Greg Brunet

              #7
              Re: Confusing performance results for prime

              "Lulu of the Lotus-Eaters" <mertz@gnosis.c x> wrote in message
              news:mailman.18 8.1066456959.21 92.python-list@python.org ...[color=blue]
              > bokr@oz.net (Bengt Richter) wrote previously:
              > |In doing some testing of different but simple algorithms for getting[/color]
              a[color=blue]
              > |list of prime numbers
              >
              > This general topic came up quite recently. I myself wrote a bunch[/color]
              about[color=blue]
              > various data structures to cache primes and the like. But that was[/color]
              all[color=blue]
              > fairly idle.
              >
              > If you want to test primality quickly, using the sieve of erotosthenes
              > is pretty terrible. At least if you only want to check relatively few
              > numbers, rather than assemble all the primes. Orders of magnitude
              > better is to use Miller-Rabin pseudoprimality tests (well, complexity
              > orders really).
              >
              > Pick the right algorithm before worrying about micro-optimizations.[/color]

              Well, actually I was trying to get the list of primes (as well as a
              count of them) below a specified value - so perhaps the sieve is the
              better algorithm to use. I also was just testing simple, easy to
              understand algorithms. Finally, the question was really more about
              understanding the cause of the behavior I was seeing after making the
              changes. I think that has been acheived. Thanks,


              --
              Greg

              Comment

              • Greg Brunet

                #8
                Re: Confusing performance results for prime

                Thanks for your results. That got me to do some more checking with
                algorithm differences, and I've posted those results in my response to
                Bengt. I guess I'm mentally stuck in 16 bit mode, forgetting that I
                could have cast to an int instead of a long.

                --
                Greg


                "Georgy Pruss" <SEE_AT_THE_END @hotmail.com> wrote in message
                news:jJ4kb.2226 0$VG6.1493@twis ter.southeast.r r.com...[color=blue]
                > I did the tests for the original Primes1 & 2 (only changed
                > range --> xrange and sqrt --> int(sqrt)) and your Primes3
                >
                > Primes1: 9592 primes up to 100000 in 0.547 secs
                > Primes2: 9592 primes up to 100000 in 0.813 secs (+48.6%)
                > Primes3: 9592 primes up to 100000 in 0.578 secs (+5.7%)
                >
                > Primes1: 49098 primes up to 600000 in 5.047 secs
                > Primes2: 49098 primes up to 600000 in 8.844 secs (+75.2%)
                > Primes3: 49098 primes up to 600000 in 5.922 secs (+17.3%)
                >
                > Primes1: 107126 primes up to 1400000 in 14.469 secs
                > Primes2: 107126 primes up to 1400000 in 27.453 secs (+89.7%)
                > Primes3: 107126 primes up to 1400000 in 17.203 secs (+18.9%)
                >
                > Georgy
                > E^mail: 'ZDAwMTEyMHQwMz MwQGhvdG1haWwuY 29t\n'.decode(' base64')[/color]

                Comment

                • Patrick Ellis

                  #9
                  Re: Confusing performance results for prime

                  "Greg Brunet" <gregbrunet@NOS PAMsempersoft.c om> wrote in message
                  news:vp3liuqs76 bd5a@corp.super news.com...[color=blue]
                  > Thanks for the responses! After some more testing based on the
                  > responses I got, here's some additional notes:
                  >
                  > <snipped the notes>[/color]

                  Here are some functions from an earlier discusion. Note that 1 and 2
                  are faster, but more memory intensive. They create a list (or array)
                  of all odd numbers and weed out the non-primes. 3 and 1ix only
                  create the primes, saving a lot of space.

                  =============== =============== =============== =====

                  from __future__ import generators
                  import Numeric
                  import math
                  import sys
                  import time

                  # Basically, Tim Peter's version of sieve.
                  def sieve1(n):
                  if n < 2:
                  return []
                  limit = int(math.sqrt(n ))
                  primes = range(1, n+1, 2)
                  primes[0] = 0
                  n = len(primes)
                  for p in primes:
                  if not p: continue
                  if p > limit: break
                  # note that p*p is odd, so no need to subtract 1
                  # before shifting right -- same thing in the end
                  for i in xrange((p*p)>>1 , n, p):
                  primes[i] = 0
                  primes[0] = 2
                  return filter(None, primes)

                  # The same as sieve1, but modified by me to use numeric for more speed.
                  def sieve2(n):
                  if n < 2:
                  return []
                  primes = Numeric.arrayra nge(1, n+1, 2)
                  limit = (int(math.sqrt( n)) / 2) + 1
                  for p in primes[1:limit]:
                  if p:
                  # note that p*p is odd, so no need to subtract 1
                  # before shifting right -- same thing in the end
                  primes[(p*p)>>1::p] = 0
                  return list(Numeric.no nzero(primes))

                  # Not sure where this came from, but I am sure I didn't write it. This is
                  # a generator, the function is below.
                  def sieve3gen(maxim um):
                  "Generate all primes <= maximum."
                  if maximum >= 2:
                  yield 2
                  if maximum >= 3:
                  yield 3
                  # candidate takes on all integer values > 3 that aren't divisible by
                  # 2 or 3.
                  candidate = 5
                  delta = 2 # flips between 2 and 4
                  # Map i to (d, 6*p), where d is 2*p or 4*p, p is a prime dividing i,
                  # and i+d is the next multiple of p larger than i not divisible by
                  # 2 or 3. As the algorithm proceeds, f will contain one entry for
                  # each prime <= sqrt(maximum) discovered so far (excepting 2 and 3).
                  f = {}
                  fetch = f.pop
                  adding = candidate**2 <= maximum
                  while candidate <= maximum:
                  if candidate in f:
                  d, s = fetch(candidate )
                  # candidate + d is next multiple of s/6 that isn't divisible
                  # by 2 or 3
                  i = candidate + d
                  d = s-d
                  while i in f:
                  i += d
                  d = s-d
                  f[i] = d, s
                  else:
                  yield candidate
                  if adding:
                  sq = candidate**2
                  f[sq] = delta * candidate, 6 * candidate
                  adding = sq <= maximum
                  candidate += delta
                  delta = 6 - delta

                  # Convert the generator into a function that matches the others.
                  def sieve3(n):
                  return [i for i in sieve3gen(n)]

                  # One of the algorithms from this thread, for comparison.
                  # test using list of primes - w/ int typecast and xrange
                  # (to see if xrange is faster - no noticable difference)
                  def Primes1ix(y):
                  primes = [3]
                  for p in xrange(5, y, 2):
                  limit = int(math.sqrt(p ))
                  for factor in primes:
                  if factor > limit: # then number is prime
                  primes.append(p )
                  break
                  elif p % factor == 0: # then it's *not* prime
                  break
                  return [2] + primes


                  if __name__ == "__main__":
                  n=1000000
                  if len(sys.argv) > 1:
                  n = int(sys.argv[1])

                  times = []
                  primes = []
                  for pfun in (sieve1, sieve2, sieve3, Primes1ix):
                  tb = time.time()
                  p = pfun(n)
                  primes.append(l en(p))
                  dt = time.time() - tb
                  times.append(dt )
                  fastest = min(times)

                  print 'Prime number algorithms/optimizations - up to:', n
                  if (fastest==0):
                  print '*** Percentages are invalid (fastest time is 0) ***'
                  fastest = .01
                  for i, pfun in enumerate((siev e1, sieve2, sieve3, Primes1ix)):
                  print '%s: %s primes up to %s in %5.3f secs (+%3.1f%%)' % (
                  pfun.__name__, primes[i], n, times[i],
                  (times[i]-fastest)*100/fastest)

                  =============== =============== =============== ======

                  Prime number algorithms/optimizations - up to: 1000000
                  sieve1: 78498 primes up to 1000000 in 0.593 secs (+99.7%)
                  sieve2: 78498 primes up to 1000000 in 0.297 secs (+0.0%)
                  sieve3: 78498 primes up to 1000000 in 1.078 secs (+263.0%)
                  Primes1ix: 78498 primes up to 1000000 in 10.438 secs (+3414.5%)


                  Comment

                  • Bengt Richter

                    #10
                    Re: Confusing performance results for prime

                    On Sat, 18 Oct 2003 17:20:34 -0500, "Greg Brunet" <gregbrunet@NOS PAMsempersoft.c om> wrote:
                    [...][color=blue]
                    >
                    >Bengt:
                    >
                    >I can understand that changing from range to xrange might help things
                    >slightly, and there are other areas as for improvement as well. It's
                    >the 2 areas that I mentioned that resulted in some unexpected results
                    >for me that I still didn't understand. I've taken your improvements on
                    >running & comparing multiple tests (neat stuff BTW) and found the
                    >following:
                    >
                    > - Changing from range to xrange doesn't make an appreciable
                    >difference (it flip-flops back & forth over multiple runs). I believe I
                    >had noticed this before, I've now got that actually built into the test
                    >now.[/color]
                    I guess there must be some optimization at work.
                    [color=blue]
                    >
                    > - Getting rid of the Sqrt requires you to perform the (x*x>y) type
                    >test into the loop (whereas performing the sqrt operation allows you to
                    >move that calculation outside of the loop and do a simpler
                    >(factor>limi t) test inside the loop). Even though I realized that this[/color]
                    Yes, that was a blunder on my part.
                    [color=blue]
                    >means I need to import math (at least the sqrt function), I figured that
                    >this still be better (which I think is Georgy's point). As I increase
                    >the upperlimit, the sqrt version gets better & better compared to the
                    >x*x version (as one would expect - trading off a more 'expensive'
                    >operation in order to move it outside of the loop). I did tests at
                    >100k, 200k, & 300k, and by 300k, the sqrt test was about 20% faster than
                    >the x*x test.
                    >
                    >I've noted some other things that I've tried and put an updated test
                    >program in a response to my original message. Thanks for your help -
                    >it's helped me understand where the improvements were coming from and
                    >that's what my main goal was all about.
                    >[/color]
                    Lulu is right about micro-optimization though ;-)

                    Regards,
                    Bengt Richter

                    Comment

                    Working...