The impossibility of boosting distributed service resilience

Paul Attie; Rachid Guerraoui; Petr Kouznetsov; Nancy Lynch; Sergio Rajsbaum

Research output: Contribution to conference › Paper › peer-review

Abstract

We prove two theorems saying that no distributed system in which processes coordinate using reliable registers and f-resilient services can solve the consensus problem in the presence of f + 1 undetectable process stopping failures. (A service is f-resilient if it is guaranteed to operate as long as no more than f of the processes connected to it fail.) Our first theorem assumes that the given services are atomic objects, and allows any connection pattern between processes and services. In contrast, we show that it is possible to boost the resilience of systems solving problems easier than consensus: the k-set consensus problem is solvable for 2k - 1 failures using 1-resilient consensus services. The first theorem and its proof generalize to the larger class of failure-oblivious services. Our second theorem allows the system to contain failure-aware services, such as failure detectors, in addition to failure-oblivious services; however, it requires that each failure-aware service be connected to all processes. Thus, f + 1 process failures overall can disable all the failure-aware services. In contrast, it is possible to boost the resilience of a system solving consensus if arbitrary patterns of connectivity are allowed between processes and failure-aware services: consensus is solvable for any number of failures using only 1-resilient 2-process perfect failure detectors.

Original language	English (US)
Pages	39-48
Number of pages	10
State	Published - 2005
Externally published	Yes
Event	25th IEEE International Conference on Distributed Computing Systems - Columbus, OH, United States Duration: Jun 6 2005 → Jun 10 2005

Conference

Conference	25th IEEE International Conference on Distributed Computing Systems
Country/Territory	United States
City	Columbus, OH
Period	6/6/05 → 6/10/05

ASJC Scopus subject areas

Software
Hardware and Architecture
Computer Networks and Communications

Cite this

@conference{6446cbde321f4a329e2400e8692687c0,

title = "The impossibility of boosting distributed service resilience",

abstract = "We prove two theorems saying that no distributed system in which processes coordinate using reliable registers and f-resilient services can solve the consensus problem in the presence of f + 1 undetectable process stopping failures. (A service is f-resilient if it is guaranteed to operate as long as no more than f of the processes connected to it fail.) Our first theorem assumes that the given services are atomic objects, and allows any connection pattern between processes and services. In contrast, we show that it is possible to boost the resilience of systems solving problems easier than consensus: the k-set consensus problem is solvable for 2k - 1 failures using 1-resilient consensus services. The first theorem and its proof generalize to the larger class of failure-oblivious services. Our second theorem allows the system to contain failure-aware services, such as failure detectors, in addition to failure-oblivious services; however, it requires that each failure-aware service be connected to all processes. Thus, f + 1 process failures overall can disable all the failure-aware services. In contrast, it is possible to boost the resilience of a system solving consensus if arbitrary patterns of connectivity are allowed between processes and failure-aware services: consensus is solvable for any number of failures using only 1-resilient 2-process perfect failure detectors.",

author = "Paul Attie and Rachid Guerraoui and Petr Kouznetsov and Nancy Lynch and Sergio Rajsbaum",

year = "2005",

language = "English (US)",

pages = "39--48",

note = "25th IEEE International Conference on Distributed Computing Systems ; Conference date: 06-06-2005 Through 10-06-2005",

}

TY - CONF

T1 - The impossibility of boosting distributed service resilience

AU - Attie, Paul

AU - Guerraoui, Rachid

AU - Kouznetsov, Petr

AU - Lynch, Nancy

AU - Rajsbaum, Sergio

PY - 2005

Y1 - 2005

N2 - We prove two theorems saying that no distributed system in which processes coordinate using reliable registers and f-resilient services can solve the consensus problem in the presence of f + 1 undetectable process stopping failures. (A service is f-resilient if it is guaranteed to operate as long as no more than f of the processes connected to it fail.) Our first theorem assumes that the given services are atomic objects, and allows any connection pattern between processes and services. In contrast, we show that it is possible to boost the resilience of systems solving problems easier than consensus: the k-set consensus problem is solvable for 2k - 1 failures using 1-resilient consensus services. The first theorem and its proof generalize to the larger class of failure-oblivious services. Our second theorem allows the system to contain failure-aware services, such as failure detectors, in addition to failure-oblivious services; however, it requires that each failure-aware service be connected to all processes. Thus, f + 1 process failures overall can disable all the failure-aware services. In contrast, it is possible to boost the resilience of a system solving consensus if arbitrary patterns of connectivity are allowed between processes and failure-aware services: consensus is solvable for any number of failures using only 1-resilient 2-process perfect failure detectors.

AB - We prove two theorems saying that no distributed system in which processes coordinate using reliable registers and f-resilient services can solve the consensus problem in the presence of f + 1 undetectable process stopping failures. (A service is f-resilient if it is guaranteed to operate as long as no more than f of the processes connected to it fail.) Our first theorem assumes that the given services are atomic objects, and allows any connection pattern between processes and services. In contrast, we show that it is possible to boost the resilience of systems solving problems easier than consensus: the k-set consensus problem is solvable for 2k - 1 failures using 1-resilient consensus services. The first theorem and its proof generalize to the larger class of failure-oblivious services. Our second theorem allows the system to contain failure-aware services, such as failure detectors, in addition to failure-oblivious services; however, it requires that each failure-aware service be connected to all processes. Thus, f + 1 process failures overall can disable all the failure-aware services. In contrast, it is possible to boost the resilience of a system solving consensus if arbitrary patterns of connectivity are allowed between processes and failure-aware services: consensus is solvable for any number of failures using only 1-resilient 2-process perfect failure detectors.

UR - http://www.scopus.com/inward/record.url?scp=27944465381&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=27944465381&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:27944465381

SP - 39

EP - 48

T2 - 25th IEEE International Conference on Distributed Computing Systems

Y2 - 6 June 2005 through 10 June 2005

ER -
