Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Alex Martelli

    #16
    Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

    Mark Dickinson <dickinsm@gmail .comwrote:
    On Sep 2, 12:55 pm, al...@mac.com (Alex Martelli) wrote:
    Mark Dickinson <dicki...@gmail .comwrote:
    Well, for one thing, you're creating half a million xrange objects in
    the course of the search. All the C code has
    to do is increment a few integers.
    I don't think the creation of xrange objects is a meaningful part of
    Python's execution time here. Consider:
    [...]
    >
    Agreed---I just came to the same conclusion after doing some tests.
    So maybe it's the billion or so integer objects being created that
    dominate the running time? (Not sure how many integer objects
    actually are created here: doesn't Python cache *some* small
    integers?)
    Yep, some, say -5 to 100 or thereabouts; it also caches on a free-list
    all the "empty" integer-objects it ever has (rather than returning the
    memory for the system), so I don't think there's much optimization to be
    had on that score either.


    Alex

    Comment

    • Wildemar Wildenburger

      #17
      Re: Why is this loop heavy code so slow in Python? Possible ProjectEuler spoilers

      Martin v. Löwis wrote:
      >>(2) it is a interpretation language
      >Not quite. It's compiled to byte-code - just like Java (would you call
      >Java an 'interpreted language' ?)
      >
      Python is not implemented like Java. In Java (at least in HotSpot),
      the byte code is further compiled to machine code before execution;
      in Python, the byte code is interpreted.
      >
      OK, good. Naive question now comming to mind: Why doesn't Python do the
      latter as well?

      /W

      Comment

      • Steven D'Aprano

        #18
        Re: Why is this loop heavy code so slow in Python? Possible ProjectEuler spoilers

        On Sun, 02 Sep 2007 21:00:45 +0200, Wildemar Wildenburger wrote:
        Martin v. Löwis wrote:
        >>>(2) it is a interpretation language
        >>Not quite. It's compiled to byte-code - just like Java (would you call
        >>Java an 'interpreted language' ?)
        >>
        >Python is not implemented like Java. In Java (at least in HotSpot), the
        >byte code is further compiled to machine code before execution; in
        >Python, the byte code is interpreted.
        >>
        OK, good. Naive question now comming to mind: Why doesn't Python do the
        latter as well?
        >
        /W
        There is no single version of Java, and the reference interpretation runs
        on a virtual machine just like Python. Today there are virtual machine
        implementations of Java, native compilers, and Just In Time compilers for
        Java, including HotSpot mentioned by Martin, but Java the language was
        originally defined to run on a VM.

        See, for example, here: http://schmidt.devlib.org/java/compilers.html

        There are costs to native compilation, the biggest one of course being
        the initial investment in time and effort in creating the native
        compiler. Sun and other commercial companies have invested a lot of money
        in Java, and I don't think the money invested in Python has come even
        close. Consequently, any work into JIT compilation for Java has been done
        by volunteers.

        Nevertheless, we have Psyco, which is a JIT compiler of sorts; work also
        continues on PyPy (Python implemented in Python) which, it is hoped, will
        lead to faster Python implementations .

        Part of the reason that native compilation for Python is hard is that
        Python's dynamic object model makes it very difficult to apply the same
        sorts of compiler optimizations that C and Java allow. Comparatively
        little of the work done by Python can be safely pushed from run time to
        compile time, so it is unlikely that the average Python application will
        ever run as fast as the equivalent C code -- although that ignores the
        question of what "the equivalent C code" could possibly mean. (If the C
        code includes all the dynamic high-level features of Python, it too would
        surely run as slowly as Python, and if it didn't, it can hardly be said
        to be equivalent.) Nevertheless, by using something like Psyco parts of
        your Python code can run at virtually the same speed as C.

        A big question mark in my mind is Lisp, which according to aficionados is
        just as dynamic as Python, but has native compilers that generate code
        running as fast as highly optimized C. I'm not qualified to judge whether
        the lessons learnt from Lisp can be applied to Python, but in any case
        Lisp is an old, old language -- only Fortran is older. The amount of
        development effort and money put into Lisp dwarfs that put into Python by
        possibly a hundred or more.

        So... if you'd like to see Python run as fast as C or Lisp, and you have
        a few tens of millions of dollars spare to invest in development, I think
        the Python Software Foundation would love to hear from you.



        --
        Steven.

        Comment

        • Diez B. Roggisch

          #19
          Re: Why is this loop heavy code so slow in Python? Possible ProjectEuler spoilers

          Wildemar Wildenburger schrieb:
          Martin v. Löwis wrote:
          >>>(2) it is a interpretation language
          >>Not quite. It's compiled to byte-code - just like Java (would you call
          >>Java an 'interpreted language' ?)
          >>
          >Python is not implemented like Java. In Java (at least in HotSpot),
          >the byte code is further compiled to machine code before execution;
          >in Python, the byte code is interpreted.
          >>
          OK, good. Naive question now comming to mind: Why doesn't Python do the
          latter as well?
          because of the dynamic nature of it. Java is statically typed, so the
          JIT can heavily optimize.

          OTH psyco IS a JIT-compiler to optimize certain calculations which are
          mostly of a numerical nature. But this can't be done to the extend it is
          possible in java.

          Diez

          Comment

          • Chris Mellon

            #20
            Re: Why is this loop heavy code so slow in Python? Possible ProjectEuler spoilers

            On 9/2/07, Diez B. Roggisch <deets@nospam.w eb.dewrote:
            Wildemar Wildenburger schrieb:
            Martin v. Löwis wrote:
            >>(2) it is a interpretation language
            >Not quite. It's compiled to byte-code - just like Java (would you call
            >Java an 'interpreted language' ?)
            >
            Python is not implemented like Java. In Java (at least in HotSpot),
            the byte code is further compiled to machine code before execution;
            in Python, the byte code is interpreted.
            >
            OK, good. Naive question now comming to mind: Why doesn't Python do the
            latter as well?
            >
            because of the dynamic nature of it. Java is statically typed, so the
            JIT can heavily optimize.
            >
            OTH psyco IS a JIT-compiler to optimize certain calculations which are
            mostly of a numerical nature. But this can't be done to the extend it is
            possible in java.
            >
            Original code: 3 min, 9 seconds
            Original code with psyco: 30.28 seconds
            Original code, compiled with Pyrex: 1min 39 seconds (moves the for loops into C)
            Same, with a,b,c declared with cdef int: 20 seconds (uses c pow()
            instead of going through Python).
            Same, with the for..xrange loops rewritten to use C integer loops: 13
            seconds (saves xrange use, but since the python loop was already gone,
            not too much savings).

            With a small amount of work, you should be able to implement the C
            algorithm in Pyrex (or even just use the C algorithm, in a wrapper
            that converts the return value to an int) and get the same speed as
            the C version + a constant marshalling factor.

            Adding pysco took all of 20 seconds (half that because I needed to
            move the module scope code into a function), and re-writing with pyrex
            took a minute or two. So, as a demonstration of numeric optmization,
            both of them can give you quite good rewards with minimal effort.

            Comment

            • Neil Cerutti

              #21
              Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

              On 2007-09-02, Martin v. Löwis <martin@v.loewi s.dewrote:
              >>(2) it is a interpretation language
              >Not quite. It's compiled to byte-code - just like Java (would
              >you call Java an 'interpreted language' ?)
              >
              Python is not implemented like Java. In Java (at least in
              HotSpot), the byte code is further compiled to machine code
              before execution; in Python, the byte code is interpreted.
              >
              Whether this makes Python an interpreter or a compiler, I don't
              know.
              I'd call it an integrated compiler and virtual machine--a classic
              combination.

              --
              Neil Cerutti

              Comment

              • Paul Rubin

                #22
                Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

                Steven D'Aprano <steve@REMOVE-THIS-cybersource.com .auwrites: A big
                question mark in my mind is Lisp, which according to aficionados is
                just as dynamic as Python, but has native compilers that generate
                code running as fast as highly optimized C. I'm not qualified to
                judge whether the lessons learnt from Lisp can be applied to Python,
                but in any case Lisp is an old, old language -- only Fortran is
                older. The amount of development effort and money put into Lisp
                dwarfs that put into Python by possibly a hundred or more.
                Writing a simple Lisp compiler is not all that difficult. There's a
                chapter in SICP about how to do it, so it's sometimes part of
                introductory courses. To get C-like performance you end up having to
                rely on user-supplied type declarations and/or type inference, but
                even without that you can still do ok. I'd say Lisp is a less dynamic
                language than Python. Lisp doesn't have duck typing and doesn't
                represent class instance variables as dictionary elements that have to
                be looked up at runtime all the time. This maybe even applies to
                locals, e.g. in Python if you say

                x = 3
                print "hello"
                y = x + 4

                you're possibly not guaranteed that y=7, because you might have bound
                sys.stdout to something with a .write method that reaches up the call
                stack and messes with the caller's local variables, and if the langref
                manual says this is supposed to work, then the compiler has to do it
                like CPython. Maybe Python 4.0 will fix some of this stuff.

                Comment

                • jwrweatherley@gmail.com

                  #23
                  Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

                  Thanks for all the answers to my question. I think what I need to take
                  away from this is that xrange is an object - I thought it was just
                  some loop construct, and that maths is slow in python - so avoid
                  pathological looping.I remember the first time I tried Objective-C on
                  OS X I used the NSNumber class for arithmetic - the results that time
                  were pretty awful too!


                  Comment

                  • Paul Boddie

                    #24
                    Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

                    On 3 Sep, 15:39, jwrweather...@g mail.com wrote:
                    Thanks for all the answers to my question. I think what I need to take
                    away from this is that xrange is an object
                    Indeed, but using xrange can be faster than converting your "for"
                    loops to "while" loops plus a counter; I converted your code to use
                    the latter arrangement and it was noticeably slower. Perhaps the
                    repeated invocation of each xrange object's next method is less
                    expensive than repeatedly obtaining new incremented int objects and
                    testing them against other int objects.
                    - I thought it was just some loop construct, and that maths is slow in
                    python - so avoid pathological looping.
                    You'd be wise to avoid doing unnecessary work deep within nested loops
                    in any programming language. Sadly, it's not possible for the compiler
                    to work out that some calculations can be lifted out of the innermost
                    loops in Python, since the various guarantees that make such
                    operations possible in other languages are not offered by Python.
                    I remember the first time I tried Objective-C on OS X I used the
                    NSNumber class for arithmetic - the results that time were pretty
                    awful too!
                    Obviously, it'd be a fairer test if you used comparable numeric types
                    in both implementations , but a more capable numeric type would be
                    overkill for the C implementation of this program, and a less capable
                    numeric type doesn't exist for Python unless you're willing to use a
                    dialect such as Pyrex (as others have shown).

                    Paul

                    Comment

                    • Bruno Desthuilliers

                      #25
                      Re: Why is this loop heavy code so slow in Python? Possible ProjectEuler spoilers

                      Martin v. Löwis a écrit :
                      >>(2) it is a interpretation language
                      >Not quite. It's compiled to byte-code - just like Java (would you call
                      >Java an 'interpreted language' ?)
                      >
                      Python is not implemented like Java. In Java (at least in HotSpot),
                      the byte code is further compiled to machine code before execution;
                      This is a VM-specific feature.
                      in Python, the byte code is interpreted.
                      Idem.
                      Whether this makes Python an interpreter or a compiler,
                      I don't know.
                      This is an old troll^Mdiscussi on !-)

                      Now IANAL, but AFAIK, the byte-code compilation stage can make a great
                      difference performances-wide, and for a same language, a byte-compiled
                      implementation is likely to be faster than a pure-interpreted one, at
                      least because of the saving on parsing time (please someone correct me
                      if I'm wrong) ...

                      Comment

                      • paulhankin

                        #26
                        Re: Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

                        On Sep 2, 12:51 pm, jwrweather...@g mail.com wrote:
                        ... right-angled triangle puzzle solver
                        >
                        max = 0
                        maxIndex = 0
                        index = 0
                        for solution in solutions:
                        if solution max:
                        max = solution
                        maxIndex = index
                        index += 1
                        >
                        print maxIndex
                        Not that it solves your slowness problem, or has anything to
                        do with the question you asked :), but you can replace this
                        last segment of your code with:

                        print solutions.index (max(solutions) )

                        --
                        Paul Hankin

                        Comment

                        Working...