[BOOK] Introduction to parallel processing: algorithms and architectures
B Parhami - 1999 - books.google.com
… significant simplification of the design process. However, two major roadblocks have thus
far … presented at several annual conferences, known as FMPC, ICPP, IPPS, SPAA, SPDP (now …
The Cilk++ concurrency platform
CE Leiserson - Proceedings of the 46th Annual Design Automation …, 2009 - dl.acm.org
… Specifically, two parallel instantiations of walk may attempt to update the shared global
variable output list in parallel at line 10. The traditional solution to fixing this kind of data race is to …
Thread scheduling for multiprogrammed multiprocessors
NS Arora, RD Blumofe, CG Plaxton - … annual ACM symposium on Parallel …, 1998 - dl.acm.org
… SPAA 98 Puerto Vallarta Mexico Copyright ACM 1998 0-89791-989-0/98/ 6...$5.00 … the
“bounded tags” algorithm [28]. We claim that the deque implementation presented above is 122 …
Scheduling multithreaded computations by work stealing
RD Blumofe, CE Leiserson - Journal of the ACM (JACM), 1999 - dl.acm.org
… Because multithreaded computations with arbitrary dependencies can be impossible to
schedule efficiently [Blumofe and Leiserson 1998], we study subclasses of general …
The implementation of the Cilk-5 multithreaded language
M Frigo, CE Leiserson, KH Randall - … ACM SIGPLAN 1998 conference …, 1998 - dl.acm.org
… , ample “parallel slackness” [28] … two clones of each procedure--a fast clone and a slow
clone. The fast clone operates much as does the C elision and has little support for parallelism. …
The data locality of work stealing
UA Acar, GE Blelloch, RD Blumofe - … annual ACM symposium on Parallel …, 2000 - dl.acm.org
… stealing algorithm, define series-parallel and nested-parallel computations … 6, 9] we represent
a multithreaded computation as a directed acyclic graph, a dag, of instructions (see Figure 2…
Kendo: efficient deterministic multithreading in software
… To help introduce the reader to our deterministic locking algorithm, we present two versions.
The first, presented in Section 3.2.1, is a simplified algorithm that does not support nested …
[PDF] Fast set operations using treaps
GE Blelloch, M Reid-Miller - … tenth annual ACM symposium on Parallel …, 1998 - dl.acm.org
… SPAA 98 Puerto Vallarta Mexico Copyright ACM 1998 0-89791-989-0/98/ 6...$5.00 … is
about three times the overhead of a C function call [28]. In Figure 7 we show the times for union …
Designing irregular parallel algorithms with mutual exclusion and lock-free protocols
… -theoretic algorithms for illustrative purposes, we show experimental results on two shared-…
parallel algorithms with efficient fine-grained synchronization may yield good performance. …
Detecting data races in Cilk programs that use locks
GI Cheng, M Feng, CE Leiserson, KH Randall… - … symposium on Parallel …, 1998 - dl.acm.org
… SPAA 98 Puerto Vallarta Mexico Copyright ACM 1998 0-89791-989-0/98/ … of Netzer and
Miller [28], apparent data races-those that appear to occur in a computation according to the …