Derecho: Fast state machine replication for cloud services

Sagar Jha, Jonathan Behrens, Theo Gkountouvas, Matthew Milano, Weijia Song, Edward Tremel, Robbert Van Renesse, Sydney Zink, Kenneth P. Birman

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

Cloud computing services often replicate data and may require ways to coordinate distributed actions. Here we present Derecho, a library for such tasks. The API provides interfaces for structuring applications into patterns of subgroups and shards, supports state machine replication within them, and includes mechanisms that assist in restart after failures. Running over 100 Gbps RDMA, Derecho can send millions of events per second in each subgroup or shard, and throughput peaks at 16 GB/s, substantially outperforming prior solutions. Configured to run purely on TCP, Derecho is still substantially faster than comparable widely used, highly tuned, standard tools. The key insight is that on modern hardware (including non-RDMA networks), data-intensive protocols should be built from non-blocking data-flow components.

Original language: English (US)
Article number: 4
Journal: ACM Transactions on Computer Systems
Volume: 36
Issue number: 2
State: Published - 2019
Externally published: Yes

Keywords

  • Cloud computing
  • Consistency
  • Non-volatile memory
  • RDMA
  • Replication

ASJC Scopus subject areas

  • General Computer Science
