when does the GIL really block?

  • Craig Allen

    when does the GIL really block?

    I have followed the GIL debate in Python for some time. I don't want
    to get into the regular debate about whether it should be gotten rid of
    (though I am curious about the status of that for Python 3)...
    personally I think I can do multi-threaded programming well, but I
    also see the benefits of a multiprocess approach. I'm not so
    egotistical that I don't realize perhaps my mt programming has not
    been "right" (though it worked and was debuggable), or more likely that
    in doing it right I have avoided even trying some things people want mt
    programming to do... i.e. to do mt programming right you start to use
    queues a lot for asynchronous, non-blocking inter-thread communication,
    which is essentially the multi-process approach using IPC (except
    that the threads can see the same memory when, in your special case,
    you know that's ok). Given something like a reader-writer lock, this
    can have benefits... but again, whatever.

    My question is this: given this problem, years ago, before I started
    writing in Python, I wrote some short programs in Python which could,
    in fact, keep both my CPUs busy. In retrospect I assume I did not have
    code in my run function that causes a GIL lock... so I have done this
    again.

    I start two threads... I use gkrellm to watch my processors (dual
    processor machine). If I merely print a number... both CPUS are
    getting 90% simultaneous loads. If I increment a counter and print it
    too, the same, and if I create a small list and sort it, the same. I
    did not expect this... I expected to see one processor pegged at
    around 100%, which should sometimes switch to the other processor.
    Granted, the same program in C/C++ would peg both processors at
    100%... but given that the overhead in the interpreter cannot explain
    the extra usage, I assume the code in my thread's run functions is
    actually executing non-serially.

    I assume this is because what I am doing does not require the GIL to
    be locked for a significant part of the time my code is running...
    what code could I put in my run function to see the behavior I
    expected? What code could I put there to take advantage of the
    possibility that really the GIL is not locked enough to cause actual
    serialization of the threads... anyone care to explain?
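
For anyone who wants to reproduce the test described above, here is a minimal sketch of what such a program might look like. It is a reconstruction, not the poster's code: the `Worker` class, `run_experiment`, `quiet_run`, the iteration count and the list contents are all made up for illustration, and it is written for Python 3.

```python
import io
import sys
import threading


class Worker(threading.Thread):
    """One of the two busy threads (hypothetical reconstruction)."""

    def __init__(self, iterations, out):
        super().__init__()
        self.iterations = iterations
        self.out = out
        self.counter = 0

    def run(self):
        for _ in range(self.iterations):
            self.counter += 1                    # increment a counter
            sorted([5, 3, 8, 1, 9, 2])           # create a small list and sort it
            print(self.counter, file=self.out)   # print the number (this is I/O)


def run_experiment(iterations=100_000, out=sys.stdout):
    """Start two workers, wait for both, and return their final counters."""
    workers = [Worker(iterations, out), Worker(iterations, out)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return [worker.counter for worker in workers]


def quiet_run(iterations=100_000):
    """Same loop, but the output goes to an in-memory buffer instead of a terminal."""
    return run_experiment(iterations, io.StringIO())
```

Watching a CPU monitor during `run_experiment()` and again during `quiet_run()` may help separate the load caused by terminal output from the load caused by the loop itself.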
  • Rhamphoryncus

    #2
    Re: when does the GIL really block?

    On Jul 31, 7:27 pm, Craig Allen <callen...@gmail.com> wrote:
    > I start two threads... I use gkrellm to watch my processors (dual
    > processor machine). If I merely print a number... both CPUS are
    > getting 90% simultaneous loads. If I increment a counter and print it
    > too, the same, and if I create a small list and sort it, the same. I
    > did not expect this... I expected to see one processor pegged at
    > around 100%, which should sometimes switch to the other processor.
    > Granted, the same program in C/C++ would peg both processors at
    > 100%... but given that the overhead in the interpreter cannot explain
    > the extra usage, I assume the code in my thread's run functions is
    > actually executing non-serially.

    Try using sys.setcheckinterval(10000) (or even larger), overriding the
    default of 100. This will reduce the locking overhead, which might be
    why you see both CPUs as busy.
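
A note for readers on current interpreters: `sys.setcheckinterval` counts bytecode instructions and belongs to the Python 2 era of this thread. It was deprecated in Python 3.2 and removed in 3.9. From 3.2 on the corresponding knob is `sys.setswitchinterval`, which takes a time slice in seconds (default 0.005). The sketch below assumes Python 3.2 or later; the `switch_interval` helper name is hypothetical.

```python
import sys
from contextlib import contextmanager


@contextmanager
def switch_interval(seconds):
    """Temporarily change how often the interpreter offers to switch threads."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        # Always restore the earlier setting, even if the body raises.
        sys.setswitchinterval(previous)
```

It would be used as `with switch_interval(0.05): ...` around the threaded test. A longer interval means fewer hand-offs of the GIL between threads, which is the same idea as raising the check interval above.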

    > I assume this is because what I am doing does not require the GIL to
    > be locked for a significant part of the time my code is running...
    > what code could I put in my run function to see the behavior I
    > expected? What code could I put there to take advantage of the
    > possibility that really the GIL is not locked enough to cause actual
    > serialization of the threads... anyone care to explain?

    The GIL is locked during *all* access to the python interpreter.
    There's nothing pure python code can do to avoid it - only a C
    extension that doesn't access python could.
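
One way to check this claim on a standard (GIL-enabled) CPython build is to time a pure-Python loop run twice in a row against the same loop run in two threads. The sketch below uses hypothetical helper names (`count_up`, `timed_serial`, `timed_threaded`). If the threads were truly running in parallel, the threaded time would be about half the serial time; with the GIL the two are typically close.

```python
import threading
import time


def count_up(n, results, slot):
    """Pure-Python busy loop; every iteration runs interpreter bytecode."""
    total = 0
    for _ in range(n):
        total += 1
    results[slot] = total


def timed_serial(n):
    """Run the loop twice, one after the other, in the calling thread."""
    results = [0, 0]
    start = time.perf_counter()
    count_up(n, results, 0)
    count_up(n, results, 1)
    return time.perf_counter() - start, results


def timed_threaded(n):
    """Run the same two loops in two threads started together."""
    results = [0, 0]
    threads = [
        threading.Thread(target=count_up, args=(n, results, slot))
        for slot in range(2)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start, results
```

Comparing `timed_serial(5_000_000)[0]` with `timed_threaded(5_000_000)[0]` on a multi-core machine gives a rough picture; exact numbers depend on the interpreter version and the machine.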


    • Craig Allen

      #3
      Re: when does the GIL really block?

      On Aug 1, 12:06 pm, Rhamphoryncus <rha...@gmail.com> wrote:
      > The GIL is locked during *all* access to the python interpreter.
      > There's nothing pure python code can do to avoid it - only a C
      > extension that doesn't access python could.

      thanks


      • John Krukoff

        #4
        Re: when does the GIL really block?


        On Thu, 2008-07-31 at 18:27 -0700, Craig Allen wrote:
        > I start two threads... I use gkrellm to watch my processors (dual
        > processor machine). If I merely print a number... both CPUS are
        > getting 90% simultaneous loads. If I increment a counter and print it
        > too, the same, and if I create a small list and sort it, the same. I
        > did not expect this... I expected to see one processor pegged at
        > around 100%, which should sometimes switch to the other processor.
        > Granted, the same program in C/C++ would peg both processors at
        > 100%... but given that the overhead in the interpreter cannot explain
        > the extra usage, I assume the code in my thread's run functions is
        > actually executing non-serially.

        It's worth mentioning that the most common place for the Python
        interpreter to release the GIL is during I/O, and printing a number to
        the screen certainly counts as I/O. You might try again with a set of
        loops that only increment and don't print; you may then see the GIL
        in action more obviously.
        --
        John Krukoff <jkrukoff@ltgc.com>
        Land Title Guarantee Company
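
The point about blocking calls releasing the GIL can be seen in a small timing sketch. It uses `time.sleep` as a stand-in for a blocking I/O call (an assumption made to keep the example quiet and repeatable; `time.sleep` does release the GIL while it waits). The helper names `blocking_task`, `elapsed_serial` and `elapsed_threaded` are hypothetical.

```python
import threading
import time


def blocking_task(seconds):
    """Stand-in for blocking I/O: the thread waits without holding the GIL."""
    time.sleep(seconds)


def elapsed_serial(seconds, count=2):
    """Run the blocking task `count` times back to back."""
    start = time.perf_counter()
    for _ in range(count):
        blocking_task(seconds)
    return time.perf_counter() - start


def elapsed_threaded(seconds, count=2):
    """Run the blocking task in `count` threads started together."""
    threads = [
        threading.Thread(target=blocking_task, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start
```

Here the threaded version should finish in roughly the time of one wait rather than two, because the waits overlap. Swapping `blocking_task` for a loop that only increments a counter removes that overlap, which is the experiment suggested above.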


        • Craig Allen

          #5
          Re: when does the GIL really block?

          On Aug 1, 2:28 pm, John Krukoff <jkruk...@ltgc.com> wrote:
          > It's worth mentioning that the most common place for the python
          > interpreter to release the GIL is during I/O, which printing a number to
          > the screen certainly counts as. You might try again with a set of loops
          > that only increment, and don't print, and you may more obviously see the
          > GIL in action.

          thanks, good idea, I think I'll try that.
