Python's biggest compromises

  • Robin Becker

    #61
    Re: Python's biggest compromises

    In article <bgr96h$a9$1@panix3.panix.com>, Aahz <aahz@pythoncraft.com>
    writes
    .....
    >
    >(Yes, there are issues with Python on SMP machines, but to call Python's
    >built-in threading "non-existent SMP scalability" is either a lie or
    >revelatory of near-complete ignorance. That doesn't even count the
    >various IPC mechanisms.)
    I'm not an expert, but the various grid computation schemes seem to
    prefer either Java or C/C++. I suspect that those schemes aren't really
    using threads in the main, since they seem to be running between
    machines in different parts of the world. I suspect Python would be
    in better shape if we could migrate threads or tasklets from one
    processor to another.

    I believe Pyro can almost do that, but I haven't tried it.
    --
    Robin Becker


    • Syver Enstad

      #62
      Re: Python's biggest compromises

      aahz@pythoncraft.com (Aahz) writes:

      > (Yes, there are issues with Python on SMP machines, but to call
      > Python's built-in threading "non-existent SMP scalability" is either
      > a lie or revelatory of near-complete ignorance. That doesn't even
      > count the various IPC mechanisms.)

      It's an interesting subject though. How does Python threading on SMP
      machines compare with, e.g., Java and C++? I know that at least the
      MSVC compiler has a GIL-like problem with heap access (new, malloc,
      delete, free), which is guarded with a global lock.

      Would migrating the global data for a thread to some sort of
      thread-local storage help Python SMP performance? If Java has better
      threading performance than Python, how have they solved the interpreter
      state problem? Java is interpreted, isn't it?
      --

      Vennlig hilsen

      Syver Enstad


      • enoch

        #63
        Re: Python's biggest compromises

        aahz@pythoncraft.com (Aahz) wrote in message news:<bgr96h$a9$1@panix3.panix.com>...
        > In article <ad02da8c.0308060705.7ff1fa4f@posting.google.com>,
        > enoch <enoch@gmx.net> wrote:
        > >anthony_barker@hotmail.com (Anthony_Barker) wrote in message news:<899f842.0307310555.56134f71@posting.google.com>...
        > >>
        > >> What to you think python largest compromises are?
        > >
        > >Its non existant SMP scalability.
        >
        > Would you care to back up your claim with some actual evidence?
        >
        > (Yes, there are issues with Python on SMP machines, but to call Python's
        > built-in threading "non-existent SMP scalability" is either a lie or
        > revelatory of near-complete ignorance.

        OK, I confess, the term you cited might be a little bit exaggerated. But
        there's no need to get personal. I'm surely not a liar (w.r.t. this
        thread; everything else is not a matter of public concern ;) ). The
        ignorance part, well, we can talk about that ...

        > That doesn't even count the various IPC mechanisms.)

        Correct me if I'm wrong, but I don't think any form of IPC is a
        measure of the scalability of something like the Python interpreter.

        Here are some sources which show that I'm not alone in my assessment
        that Python has deficiencies w.r.t. SMP systems:

        http://www.python.org/pycon/papers/deferex/

        """
        It is optimal, however, to avoid requiring threads for any part of a
        framework. Threading has a significant cost, especially in Python. The
        global interpreter lock destroys any performance benefit that
        threading may yield on SMP systems, [...]
        """



        (note the author of that post)
        """
        >My project will be running on an SMP box and requires scalability.
        >However, my test shows that Python threading has very poor performance
        >in terms of scaling. In fact it doesn't scale at all.

        That's true for pure Python code.
        """

        I'm aware that you know quite well about these facts, so I'll leave it
        at that. But let me just add one more link which maybe you don't know:



        """
        Well, in worst case, it can actually give you performance UNDER 1X.
        The latency switching the GIL between CPUs comes right off your
        ability to do work in a quanta. If you have a 1 gigahertz machine
        capable of doing 12,000 pystones of work, and it takes 50 milliseconds
        to switch the GIL (I don't know how long it takes, this is an example)
        you would lose 5% of your peak performance for *EACH* GIL switch.
        Setting sys.setcheckinterval(240) will still yield the GIL 50 times a
        second. If the GIL actually migrates only 10% of the time it's
        released, that would be 50 * .1 * 5% = 25% performance loss. The cost
        to switch the GIL is going to vary, but will probably range between .1
        and .9 time quantas (scheduler time intervals) and a typical time
        quanta is 5 to 10ms.
        [...]
        However, I have directly observed a 30% penalty under MP constraints
        when the sys.setcheckinterval value was too low (and there was too
        much GIL thrashing).
        """

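The arithmetic in the estimate quoted above can be checked with a short sketch. The function and parameter names are made up for illustration, and the 50 ms switch cost is the quote's own placeholder figure, not a measurement.

```python
def estimated_gil_loss(releases_per_second, migration_fraction, switch_cost_seconds):
    """Fraction of each second lost to the GIL migrating between CPUs.

    All three inputs are assumptions taken from the quoted example,
    not measured values.
    """
    migrations_per_second = releases_per_second * migration_fraction
    return migrations_per_second * switch_cost_seconds


# 50 releases/s, 10% of them migrate, 50 ms (5% of a second) per migration
loss = estimated_gil_loss(releases_per_second=50,
                          migration_fraction=0.1,
                          switch_cost_seconds=0.050)
print(f"{loss:.0%}")  # prints 25%
```

With the quote's more plausible range of 0.1 to 0.9 of a 5 to 10 ms quantum per switch, the same formula gives a much smaller loss, which is why the 50 ms figure should be read as an illustration only.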
        So, although Python is capable of taking advantage of SMP systems
        under certain circumstances (I/O-bound systems etc.), there are
        real-world situations where Python's performance is _hurt_ by running
        on an SMP system.
        Btw. I think even IPC might not help you there, because the different
        processes might bounce between CPUs, so only processor binding might
        help.
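As a rough illustration of what "processor binding" can look like from Python, here is a minimal sketch. It assumes a platform where `os.sched_setaffinity` is available (Linux, Python 3.3 or later); the helper name `pin_current_process` is hypothetical.

```python
import os


def pin_current_process(cpus):
    """Restrict the calling process to the given CPU ids.

    Returns the resulting affinity set, or None on platforms that do not
    provide os.sched_setaffinity (for example macOS and Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, set(cpus))  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)
```

Each worker process would call this once at startup with its own CPU id. Whether binding actually helps depends on the scheduler and the workload, so it is worth measuring before relying on it.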



        I did quite a bit of googling on this problem - several times -
        because I'm selling Zope solutions. Sometimes the client wants to run
        the solution on an existing SMP system, and worse, the system has to
        fulfill some performance requirements. Then I have the problem of
        explaining to him that his admins need to undertake some special tasks
        in order for Zope to be able to exploit the multiple processors in his
        system.




        Aahz, I've been lurking in this newsgroup for approx. 3 years, so I know
        who you are. You have participated in nearly every discussion about
        threads, I know your slides, and there's no doubt that you have forgotten
        more about this subject than I'll ever know.


        • Aahz

          #64
          Re: Python's biggest compromises

          In article <uznim5ynx.fsf@online.no>,
          Syver Enstad <syver-en+usenet@online.no> wrote:
          >aahz@pythoncraft.com (Aahz) writes:
          >>
          >> (Yes, there are issues with Python on SMP machines, but to call
          >> Python's built-in threading "non-existent SMP scalability" is either
          >> a lie or revelatory of near-complete ignorance. That doesn't even
          >> count the various IPC mechanisms.)
          >
          >It's an interesting subject though. How does python threading on SMP
          >machines compare with f.ex. Java and C++. I know that at least the
          >MSVC compiler has a GIL like problem with heap access (new, malloc,
          >delete, free), which is guarded with a global lock.

          Sure, but that's not where a C++ application usually spends its time.

          >Would migrating the global data for a thread to some sort of thread
          >local storage help Python SMP performance? If Java has better
          >threading performance than Python how have they solved the interpreter
          >state problem. Java is interpreted isn't it?

          Well, that's a good question. *Does* Java have better threading
          performance than Python? If it does, to what extent is that performance
          bought at the cost of complexity for the programmer?

          Keep in mind that the GIL exists not because of issues with thread-local
          storage but because every Python object is global and can have bindings
          to it in any -- or every -- thread. Python uses objects *everywhere*;
          the GC uses Python objects, stack frames are Python objects, modules are
          Python objects. To create "thread-local" storage as you suggest would
          require a wholesale revision of Python's object model that would make it
          something other than what Python is today.
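A minimal sketch of the point above, assuming CPython: an ordinary object created in one place is reachable from every thread, so nothing is thread-local by default. The names `run`, `shared`, and `worker` are made up for illustration.

```python
import threading


def run(num_threads=4, increments=1000):
    shared = {"hits": 0}   # one ordinary object, bound in every worker thread
    seen_ids = set()       # identities of the object as each thread sees it
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            with lock:     # the read-modify-write below needs explicit locking
                shared["hits"] += 1
        with lock:
            seen_ids.add(id(shared))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Total count, and how many distinct objects the threads saw (always one)
    return shared["hits"], len(seen_ids)
```

Every thread sees the same dictionary, which is why the interpreter has to protect object internals such as reference counts no matter which thread touches them.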

          Based on recent discussions about restricted execution, I suspect that
          security would be much more likely to drive such changes; if that
          happens, perhaps revisiting the way GIL works might happen with it.
          --
          Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

          This is Python. We don't care much about theory, except where it intersects
          with useful practice. --Aahz


          • enoch

            #65
            Re: Threading advantages (was Re: Python's biggest compromises)

            aahz@pythoncraft.com (Aahz) wrote in message news:<bgs4ud$h3g$1@panix3.panix.com>...
            > <snip>
            > Since, as you say, you've done some research, that's why I flamed you.
            > There's just no call for making such an overstated claim -- it is *NOT*
            > "a little bit exaggerated".

            Well, I based this phrase on the fact that while under some
            circumstances (e.g. your web spider) Python does scale somewhat, under
            others (e.g. Zope) it may perform even worse on an SMP system. If you
            sum these two facts up ...

            > <snip IPC>
            > >Here are some sources which show that I'm not alone with my assessment
            > >that python has deficiencies w.r.t. SMP systems:
            >
            > That I won't argue. But Python's approach also has some benefits even
            > on SMP systems. And if you choose a multi-process approach, the same
            > advantages that accrue to Python's approach on a single-CPU box apply
            > just as much to an SMP system.

            Yes, and these advantages also include a simpler threading model, as
            far as I understand it, on every system. It's a compromise, that's why
            I posted in this thread.
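The multi-process approach mentioned in the quote can be sketched with the standard `concurrent.futures` module, which postdates this thread. This is an illustration under stated assumptions, not a benchmark: `count_primes` and `run_with` are hypothetical names, and the toy task stands in for any CPU-bound work.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_with(executor_cls, limits, workers=2):
    """Run count_primes over `limits` using the given executor class."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```

Calling `run_with(ThreadPoolExecutor, limits)` and `run_with(ProcessPoolExecutor, limits)` returns the same results. With processes, each worker has its own interpreter and its own GIL, so pure-Python CPU work can run on several CPUs at once. When using `ProcessPoolExecutor`, make the call from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.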

            >
            > >http://www.python.org/pycon/papers/deferex/
            > >"""
            > >It is optimal, however, to avoid requiring threads for any part of a
            > >framework. Threading has a significant cost, especially in Python. The
            > >global interpreter lock destroys any performance benefit that
            > >threading may yield on SMP systems, [...]
            > >"""
            >
            > Just because it's a published PyCon paper doesn't mean that it's correct.
            > The multi-threaded spider that I use as my example is a toy version of a
            > spider that was used on an SMP box. (That's why I became a threading
            > expert in the first place -- Tim Peters probably remembers me pestering
            > him with questions four years ago. ;-) I guarantee you that SMP made
            > that spider much faster.

            But how big is the significance of software which has the same
            characteristics as your web spider example versus application servers?

            > >So, although python is capable of taking advantage of SMP systems
            > >under certain circumstances (I/O bound systems etc. etc.), there are
            > >real world situations where python's performance is _hurt_ by running
            > >on a SMP system.
            >
            > Absolutely. But that's true of any system with threading that isn't
            > designed and tuned for the needs of a specific application. Python
            > trades performance in some situations for a clean and simple model of
            > threading.

            Again, the compromise we were talking about. I'm not in a position to
            weigh the pros and cons of it against each other, but I think I can
            point out some cons of the current approach. I'm not doing that to
            spread FUD, but to give an outsider's perspective on what I think might
            hurt Python in the future, and I want Python to thrive because I like
            using it a lot.

            > >Btw. I think even IPC might not help you there, because the different
            > >processes might bounce between CPUs, so only processor binding might
            > >help.
            >
            > My understanding is that most OSes are designed to avoid this; I'd be
            > interested in seeing some information if I'm wrong. In any event, I do
            > know that IPC speeds things up in real-world applications on SMP boxes.

            For example, there are always lots of discussions about CPU affinity
            on linux-kernel, and it seems to be a hard problem. Hyperthreading and
            other non-symmetric architectures make this problem even harder.
            Add to that the problem of the GIL getting shuffled around and you
            have a system whose performance characteristics are hard to predict.
            Admins don't like that. It's not as if there are no problems without
            the GIL, though; it just adds to the complication.

            > >I did quite a bit of googling on this problem - several times -
            > >because I'm selling zope solutions. Sometimes, the client wants to run
            > >the solution on an existing SMP system, and worse, the system has to
            > >fulfill some performance requirements. Then I have the problem of
            > >explaining to him that his admins need to undertake some special tasks
            > >in order for zope to be able to exploit the multiple procs in his
            > >system.
            >
            > Even if Zope is the 800-pound gorilla of the Python world, Python isn't
            > going to change just for Zope. If you want to talk about ways of
            > improving Zope's performance on SMP boxes, I'll be glad to contribute
            > what I can. But spreading false information isn't the way to get me
            > interested.

            I wasn't even aware that Zope is the "800-pound gorilla" of the Python
            world. I used it just as an example of a typical larger server app,
            because, well, I know it.
            Incidentally, the PyCon paper above, which you seem to dismiss as
            false, is also from a guy who is working on a larger server app.
            Maybe there's a pattern?

            > Keep in mind that one reason IPC has gained popularity is because it
            > scales more than threading does, in the end. Blade servers are cheaper
            > than big SMP boxes, and IPC works across multiple computers.

            Allow me a comment on the nature of this discussion (Python and SMP
            in general, not just this thread). I've seen it before and the
            ingredients are:

            - a major open source project
            - developers who love this project
            - some "outsider" who points out some perceived deficiency of said
            project
            - said developers pointing out (rightly or wrongly) reasons why this
            deficiency doesn't matter, or that there are other (better) ways for
            the "outsider" to achieve what he wants

            In most cases this discussion then develops into a big fat flamewar
            ;).

            Two examples are Linux and its threading capabilities, and MySQL and
            ACID compliance.
            A nice quote from the Linux discussion, btw., was from Alan Cox:

            "A Computer is a state machine. Threads are for people who can't
            program state machines."

            But today, Linux's thread support is orders of magnitude better than
            it was.

            You wrote in another message in this thread:
            > Well, that's a good question. *Does* Java have better threading
            > performance than Python? If it does, to what extent is that performance
            > bought at the cost of complexity for the programmer?

            While I can't comment on the second question, here's an article which
            sheds some light on the SMP scalability of an older Java JDK; the meat
            is on the third page:
            Business technology, IT news, product reviews and enterprise IT strategies.


            It seems that Java does indeed have better threading performance than
            Python.


            • Robin Becker

              #66
              Re: Python's biggest compromises

              In article <3f36c43a$0$49105$e4fe514c@news.xs4all.nl>, Irmen de Jong
              <irmen@-NOSPAM-REMOVETHIS-xs4all.nl> writes
              >Robin Becker wrote:
              >
              ......
              >> I believe pyro can almost do that, but I haven't tried it.
              >
              >Could you please elaborate on this a bit?
              >What exactly did you have in mind when talking about
              >"migrating threads or tasklets" ?
              >

              Well I had in mind the grid concept, which I believe implies the
              distribution of code to multiple nodes and then the ability to execute
              on them (I suppose that includes re-sending data to already distributed
              instances).

              I imagine that a proper grid would allow reloading of modules as the
              overall application requires, but that would be relatively trivial if we
              could capture 'execution state'.

              Moving a running thread to another process would be fairly hard, I
              imagine, but I guess that's what we want for load balancing etc.

              >Does this involve transporting code across nodes,
              >or only the 'execution' (and data)?
              >
              >Pyro supports transporting code, but with a few important limitations,
              >such as "once loaded, not reloaded".
              >
              >--Irmen de Jong
              >

              --
              Robin Becker
