TY - JOUR

T1 - Efficient gossip and robust distributed computation

AU - Georgiou, Chryssis

AU - Kowalski, Dariusz R.

AU - Shvartsman, Alex A.

N1 - Funding Information:
This research was supported by the NSF Grants 9988304, 0121277, and 0311368. The work of the first author was performed in part at the University of Connecticut. The work of the second author was supported by the NSF-NATO Award 0209588 and by KBN grant 4T11C04425. The work of the third author was supported in part by the NSF CAREER Award 9984778. A preliminary version of this work appears as [12]. ∗Corresponding author. E-mail addresses: chryssis@ucy.ac.cy (C. Georgiou), darek@mimuw.edu.pl (D.R. Kowalski), aas@cse.uconn.edu (A.A. Shvartsman).

PY - 2003/12/1

Y1 - 2003/12/1

N2 - This paper presents an efficient deterministic gossip algorithm for p synchronous, crash-prone, message-passing processors. The algorithm has time complexity T = O(log2 p) and message complexity M = O(p1+ε), for any ε > 0. This substantially improves the message complexity of the previous best algorithm that has M = O(p1.77), while maintaining the same time complexity. The strength of the new algorithm is demonstrated by constructing a deterministic algorithm for performing n tasks in this distributed setting. Previous solutions used coordinator or check-pointing approaches, immediately incurring a work penalty Ω(n + f·p) for f crashes, or relied on strong communication primitives, such as reliable broadcast, or had work too close to the trivial Θ(p·n) bound of oblivious algorithms. The new algorithm uses p crash-prone processors to perform n similar and idempotent tasks so long as one processor remains active. The work of the algorithm is W = O(n + p·min{f + 1, log3 p}) and its message complexity is M = O(fpε + pmin{f + 1, log p}), for any ε > 0. This substantially improves the work complexity of previous solutions using simple point-to-point messaging, while "meeting or beating" the corresponding message complexity bounds. The new algorithms use communication graphs and permutations with certain combinatorial properties that are shown to exist. The algorithms are correct for any permutations, and in particular, the same expected bounds can be achieved using random permutations.

AB - This paper presents an efficient deterministic gossip algorithm for p synchronous, crash-prone, message-passing processors. The algorithm has time complexity T = O(log2 p) and message complexity M = O(p1+ε), for any ε > 0. This substantially improves the message complexity of the previous best algorithm that has M = O(p1.77), while maintaining the same time complexity. The strength of the new algorithm is demonstrated by constructing a deterministic algorithm for performing n tasks in this distributed setting. Previous solutions used coordinator or check-pointing approaches, immediately incurring a work penalty Ω(n + f·p) for f crashes, or relied on strong communication primitives, such as reliable broadcast, or had work too close to the trivial Θ(p·n) bound of oblivious algorithms. The new algorithm uses p crash-prone processors to perform n similar and idempotent tasks so long as one processor remains active. The work of the algorithm is W = O(n + p·min{f + 1, log3 p}) and its message complexity is M = O(fpε + pmin{f + 1, log p}), for any ε > 0. This substantially improves the work complexity of previous solutions using simple point-to-point messaging, while "meeting or beating" the corresponding message complexity bounds. The new algorithms use communication graphs and permutations with certain combinatorial properties that are shown to exist. The algorithms are correct for any permutations, and in particular, the same expected bounds can be achieved using random permutations.

KW - Combinatorial tools

KW - Distributed algorithms

KW - Gossip

KW - Performing work

KW - Processor failures

UR - http://www.scopus.com/inward/record.url?scp=35248829130&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35248829130&partnerID=8YFLogxK

U2 - 10.1016/j.tcs.2005.05.019

DO - 10.1016/j.tcs.2005.05.019

M3 - Article

AN - SCOPUS:27844559682

VL - 2848

SP - 224

EP - 238

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

IS - 1-2

ER -