[comp.lang.c++ added...
]
"Dmitriy V'jukov" <dvyukov@gmail. comwrote in message
news:4306b243-2d28-4ec6-a275-883db633b9b8@l4 2g2000hsc.googl egroups.com...
On Sep 29, 4:23 pm, "Chris M. Thomasson" <n...@spam.inva lidwrote:
You mean:
1 atomic RMW and 1 StoreLoad membar for lock
1 atomic RMW and 1 LoadStore membar for unlock
2 atomics, and 2 membars; fairly expensive... Well... You indeed have a good
point in the case that stealing is rare...
]
"Dmitriy V'jukov" <dvyukov@gmail. comwrote in message
news:4306b243-2d28-4ec6-a275-883db633b9b8@l4 2g2000hsc.googl egroups.com...
On Sep 29, 4:23 pm, "Chris M. Thomasson" <n...@spam.inva lidwrote:
>
How does the internal low-level impl compare to the following:
I am looking for raw pseudo-code for atomic deque internal impl
details...
AFAICT, this work from SUN would scale better than Clik. Please correct
me
if I am way off base here. It seems like spawning a successor thread has
overheads... Humm. Pleas try to bear with me here; okay? Correct my
ignorance on Clik's work-stealing internal impl... Well, let me pick an
impl
to focus on... Say, DEC Alpha?
>Are you using something like the algorithm in TBB?
TBB is based on MIT's Cilk work:
>http://www.cilk.com/multicore-blog/b...ost-Influentia...
>http://www.cilk.com/multicore-blog/b...ost-Influentia...
I am looking for raw pseudo-code for atomic deque internal impl
details...
AFAICT, this work from SUN would scale better than Clik. Please correct
me
if I am way off base here. It seems like spawning a successor thread has
overheads... Humm. Pleas try to bear with me here; okay? Correct my
ignorance on Clik's work-stealing internal impl... Well, let me pick an
impl
to focus on... Say, DEC Alpha?
AFAIK, in early days Cilk's work-stealing deque used mutex-based
pop(). But I remember there was some mentions of non-blocking
algorithms in the Cilk's papers, something like "some people point us
that it's possible to implement work-stealing deque in completely non-
blocking manner". And I don't know whether non-blocking deque was
finally incorporated into Cilk.
pop(). But I remember there was some mentions of non-blocking
algorithms in the Cilk's papers, something like "some people point us
that it's possible to implement work-stealing deque in completely non-
blocking manner". And I don't know whether non-blocking deque was
finally incorporated into Cilk.
If mutex is spin-mutex (i.e. there is only 1 atomic RMW per lock/
unlock) and stealing is rare, then mutex-based deque is nearly the
same as non-blocking deque with 1 RMW... provided that push() doesn't
use mutex. And provided that atomic RMW has the same cost as StoreLoad
memory fence (x86).
unlock) and stealing is rare, then mutex-based deque is nearly the
same as non-blocking deque with 1 RMW... provided that push() doesn't
use mutex. And provided that atomic RMW has the same cost as StoreLoad
memory fence (x86).
1 atomic RMW and 1 StoreLoad membar for lock
1 atomic RMW and 1 LoadStore membar for unlock
2 atomics, and 2 membars; fairly expensive... Well... You indeed have a good
point in the case that stealing is rare...
Comment