TY - JOUR

T1 - Parallel Algorithms with Processor Failures and Delays

AU - Buss, Jonathan F.

AU - Kanellakis, Paris C.

AU - Ragde, Prabhakar L.

AU - Shvartsman, Alex Allister

N1 - Funding Information:
We thank Jeff Vitter, Marc Snir, and Naomi Nishimura for helpful discussions, and Franco Preparata for reviewing an earlier draft of this paper. The research of J.B. was supported by NSERC Operating Grant OGP0009171, of P.C.K. by NSF Grant IRI-8617344 and ONR Grant N00014-91-J-1613, and of P.L.R. by NSERC Operating Grant OGP0041913. The major part of the work of A.A.S. was performed while at Brown University and at Digital Equipment Corporation.

PY - 1996/1

Y1 - 1996/1

N2 - We study efficient deterministic parallel algorithms on two models: restartable fail-stop CRCW PRAMs and asynchronous PRAMs. In the first model, synchronous processes are subject to arbitrary stop failures and restarts determined by an on-line adversary and involving loss of private but not shared memory; the complexity measures are completed work (where processors are charged for completed fixed-size update cycles) and overhead ratio (completed work amortized over necessary work and failures). In the second model, the result of the computation is a serialization of the actions of the processors determined by an on-line adversary; the complexity measure is total work (number of steps taken by all processors). Despite their differences, the two models share key algorithmic techniques. We present new algorithms for the Write-All problem (in which P processors write ones into an array of size N) for the two models. These algorithms can be used to implement a simulation strategy for any N processor PRAM on a restartable fail-stop P processor CRCW PRAM such that it guarantees a terminating execution of each simulated N processor step, with O(log^2 N) overhead ratio, and O(min{N + P log^2 N + M log N, N · P^{0.59}}) (subquadratic) completed work (where M is the number of failures during this step's simulation). This strategy has a range of optimality. We also show that the Write-All requires N + Ω(P log P) completed/total work on these models for P ≤ N. © 1996 Academic Press, Inc.

AB - We study efficient deterministic parallel algorithms on two models: restartable fail-stop CRCW PRAMs and asynchronous PRAMs. In the first model, synchronous processes are subject to arbitrary stop failures and restarts determined by an on-line adversary and involving loss of private but not shared memory; the complexity measures are completed work (where processors are charged for completed fixed-size update cycles) and overhead ratio (completed work amortized over necessary work and failures). In the second model, the result of the computation is a serialization of the actions of the processors determined by an on-line adversary; the complexity measure is total work (number of steps taken by all processors). Despite their differences, the two models share key algorithmic techniques. We present new algorithms for the Write-All problem (in which P processors write ones into an array of size N) for the two models. These algorithms can be used to implement a simulation strategy for any N processor PRAM on a restartable fail-stop P processor CRCW PRAM such that it guarantees a terminating execution of each simulated N processor step, with O(log^2 N) overhead ratio, and O(min{N + P log^2 N + M log N, N · P^{0.59}}) (subquadratic) completed work (where M is the number of failures during this step's simulation). This strategy has a range of optimality. We also show that the Write-All requires N + Ω(P log P) completed/total work on these models for P ≤ N. © 1996 Academic Press, Inc.

UR - http://www.scopus.com/inward/record.url?scp=0030360110&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0030360110&partnerID=8YFLogxK

U2 - 10.1006/jagm.1996.0003

DO - 10.1006/jagm.1996.0003

M3 - Article

AN - SCOPUS:0030360110

VL - 20

SP - 45

EP - 86

JO - Journal of Algorithms

JF - Journal of Algorithms

SN - 0196-6774

IS - 1

ER -