Technical Report: Estimating Reliability of Workers for Cooperative Distributed Computing

Research output: Contribution to journal › Article › peer-review

Abstract

Internet supercomputing is an approach to solving partitionable, computation-intensive problems by harnessing the power of a vast number of interconnected computers. For the problem of using network supercomputing to perform a large collection of independent tasks, prior work introduced a decentralized approach and provided randomized synchronous algorithms that perform all tasks correctly with high probability, while dealing with misbehaving or crash-prone processors. The main weakness of existing algorithms is that they assume either that the \emph{average} probability of a non-crashed processor returning incorrect results is less than $\frac{1}{2}$, or that the probability of returning incorrect results is known to \emph{each} processor. Here we present a randomized synchronous distributed algorithm that tightly estimates the probability of each processor returning correct results. Starting with the set $P$ of $n$ processors, let $F$ be the set of processors that crash. Our algorithm estimates the probability $p_i$ of returning a correct result for each processor $i \in P-F$, making the estimates available to all these processors. The estimation is based on the $(\epsilon, \delta)$-approximation, where each estimate $\tilde{p_i}$ of $p_i$ obeys the bound ${\sf Pr}[p_i(1-\epsilon) \leq \tilde{p_i} \leq p_i(1+\epsilon) ] > 1 - \delta$, for any constants $\delta > 0$ and $\epsilon > 0$ chosen by the user. An important aspect of this algorithm is that each processor terminates without global coordination. We assess the efficiency of the algorithm in three adversarial models as follows. For the model where the number of non-crashed processors $|P-F|$ is linearly bounded, the time complexity $T(n)$ of the algorithm is $\Theta(\log{n})$, the work complexity $W(n)$ is $\Theta(n\log{n})$, and the message complexity $M(n)$ is $\Theta(n\log^2n)$.
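To make the $(\epsilon, \delta)$-approximation guarantee concrete, the following is a minimal single-machine sketch in Python. It is not the distributed algorithm of the paper. It assumes a hypothetical known lower bound `p_min` on the true probability and uses a fixed sample size derived from the standard multiplicative Chernoff bound (valid for $0 < \epsilon < 1$). The names `sample_size`, `estimate_reliability`, `within_relative_error`, and `run_task` are illustrative only.

```python
import math
import random


def sample_size(eps, delta, p_min):
    """Number of trials sufficient for a relative-error guarantee.

    Assumption (not from the paper): the true probability p satisfies
    p >= p_min. By the multiplicative Chernoff bound,
    Pr[|X - n*p| >= eps*n*p] <= 2*exp(-eps^2 * n * p / 3) for 0 < eps < 1,
    so n >= 3*ln(2/delta) / (eps^2 * p_min) gives failure probability <= delta.
    """
    if not (0 < eps < 1 and 0 < delta < 1 and 0 < p_min <= 1):
        raise ValueError("need 0 < eps < 1, 0 < delta < 1, 0 < p_min <= 1")
    return math.ceil(3 * math.log(2 / delta) / (eps ** 2 * p_min))


def estimate_reliability(run_task, eps, delta, p_min):
    """Estimate the probability that run_task() returns a correct result.

    run_task is a hypothetical callable that returns True when the worker's
    answer is correct and False otherwise.
    """
    n = sample_size(eps, delta, p_min)
    correct = sum(1 for _ in range(n) if run_task())
    return correct / n


def within_relative_error(p, p_est, eps):
    """Check the bound p*(1 - eps) <= p_est <= p*(1 + eps)."""
    return p * (1 - eps) <= p_est <= p * (1 + eps)


# Simulated worker that answers correctly with probability 0.8.
rng = random.Random(0)
p_true = 0.8
p_est = estimate_reliability(lambda: rng.random() < p_true,
                             eps=0.1, delta=0.01, p_min=0.5)
print(p_est, within_relative_error(p_true, p_est, eps=0.1))
```

The required number of trials grows as $1/\epsilon^2$ and only logarithmically in $1/\delta$, which is why a small failure probability is cheap compared with a tight relative error. A fixed sample size needs the lower bound `p_min` in advance; an estimator that adapts its stopping point to the observed outcomes would avoid that assumption.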
Original language: Undefined
Journal: CoRR
Volume: abs/1407.0696
State: Published - Jul 2 2014
