when does the GIL really block?

This topic is closed.

Craig Allen
#1

when does the GIL really block?

Aug 1 '08, 01:35 AM

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-thread asynchronous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that the threads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.

My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.

I start two threads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in my thread's run functions is
actually executing non-serially.

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of the threads... anyone care to explain?
Tags: None
Rhamphoryncus
#2

Aug 1 '08, 10:15 PM

Re: when does the GIL really block?

On Jul 31, 7:27 pm, Craig Allen <callen...@gmai l.comwrote:

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-threadasynchron ous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that thethreads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.
>
My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.
>
I start twothreads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in mythread's run functions is
actually executing non-serially.

Try using sys.setcheckint erval(10000) (or even larger), overriding the
default of 100. This will reduce the locking overhead, which might by
why you see both CPUs as busy.

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of thethreads... anyone care to explain?

The GIL is locked during *all* access to the python interpreter.
There's nothing pure python code can do to avoid it - only a C
extension that doesn't access python could.
Comment
Craig Allen
#3

Aug 2 '08, 12:05 AM

Re: when does the GIL really block?

On Aug 1, 12:06 pm, Rhamphoryncus <rha...@gmail.c omwrote:

On Jul 31, 7:27 pm, Craig Allen <callen...@gmai l.comwrote:
>
>
>

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-threadasynchron ous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that thethreads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.

>

My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.

>

I start twothreads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in mythread's run functions is
actually executing non-serially.

>
Try using sys.setcheckint erval(10000) (or even larger), overriding the
default of 100. This will reduce the locking overhead, which might by
why you see both CPUs as busy.
>

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of thethreads... anyone care to explain?

>
The GIL is locked during *all* access to the python interpreter.
There's nothing pure python code can do to avoid it - only a C
extension that doesn't access python could.

thanks
Comment
John Krukoff
#4

Aug 2 '08, 12:35 AM

Re: when does the GIL really block?

On Thu, 2008-07-31 at 18:27 -0700, Craig Allen wrote:

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-thread asynchronous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that the threads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.
>
My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.
>
I start two threads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in my thread's run functions is
actually executing non-serially.
>
I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of the threads... anyone care to explain?
--
http://mail.python.org/mailman/listinfo/python-list

It's worth mentioning that the most common place for the python
interpreter to release the GIL is during I/O, which printing a number to
the screen certainly counts as. You might try again with a set of loops
that only increment, and don't print, and you may more obviously see the
GIL in action.
--
John Krukoff <jkrukoff@ltgc. com>
Land Title Guarantee Company
Comment
Craig Allen
#5

Aug 5 '08, 02:55 AM

Re: when does the GIL really block?

On Aug 1, 2:28 pm, John Krukoff <jkruk...@ltgc. comwrote:

On Thu, 2008-07-31 at 18:27 -0700, Craig Allen wrote:

I have followed the GIL debate in python for some time. I don't want
to get into the regular debate about if it should be gotten rid of
(though I am curious about the status of that for Python 3)...
personally I think I can do multi-threaded programming well, but I
also see the benefits of a multiprocess approach. I'm not so
egotistical that I don't realize perhaps my mt programming has not
been "right" (though it worked and was debuggable) or more likely that
doing it right I have avoided even trying some things people want mt
programming to do... i.e. to do mt programming right you start to use
queues a lot, inter-thread asynchronous, non-blocking, communication,
which is essentially the multi-process approach, using IPC (except
that the threads can see the same memory when, in your special case,
you know that's ok. Given something like a reader-writer lock, this
can have benefits... but again, whatever.

>

My question is that given this problem, years ago before I started
writing in python I wrote some short programs in python which could,
in fact, busy both my CPUs. In retrospect I assume I did not have
code in my run function that causes a GIL lock... so I have done this
again.

>

I start two threads... I use gkrellm to watch my processors (dual
processor machine). If I merely print a number... both CPUS are
getting 90% simultaneous loads. If I increment a counter and print it
too, the same, and if I create a small list and sort it, the same. I
did not expect this... I expected to see one processor pegged at
around 100%, which should sometimes switch to the other processor.
Granted, the same program in C/C++ would peg both processors at
100%... but given that the overhead in the interpreter cannot explain
the extra usage, I assume the code in my thread's run functions is
actually executing non-serially.

>

I assume this is because what I am doing does not require the GIL to
be locked for a significant part of the time my code is running...
what code could I put in my run function to see the behavior I
expected? What code could I put there to take advantage of the
possibility that really the GIL is not locked enough to cause actual
serialization of the threads... anyone care to explain?
--
http://mail.python.org/mailman/listinfo/python-list

>
It's worth mentioning that the most common place for the python
interpreter to release the GIL is during I/O, which printing a number to
the screen certainly counts as. You might try again with a set of loops
that only increment, and don't print, and you may more obviously see the
GIL in action.
--
John Krukoff <jkruk...@ltgc. com>
Land Title Guarantee Company

thanks, good idea, I think I'll try that.
Comment

Previous template Next

when does the GIL really block?

when does the GIL really block?

Comment

Comment

Comment

Comment