Byzantine Disk Paxos:
Optimal Resilience with Byzantine Shared Memory.

Authors: Ittai Abraham, Gregory Chockler, Idit Keidar, and Dahlia Malkhi.

In Distributed Computing, first online issue, November 2005, in print in Volume 18, Number 5, April 2006, Springer.

Previous version in the 23rd ACM Symposium on Principles of Distributed Computing (PODC '04), St. John's, Newfoundland, Canada, July 2004.

Abstract:

We present Byzantine Disk Paxos, an asynchronous shared-memory consensus algorithm that uses a collection of n > 3t disks, t of which may fail by becoming non-responsive or arbitrarily corrupted. We give two constructions of this algorithm; that is, we construct two different t-tolerant (i.e., tolerating up to t disk failures) building blocks, each of which can be used, along with a leader oracle, to solve consensus. One building block is a t-tolerant wait-free shared safe register. The second building block is a t-tolerant regular register that satisfies a weaker termination (liveness) condition than wait freedom: its write operations are wait-free, whereas its read operations are guaranteed to return only in executions with a finite number of writes. We call this termination condition finite writes (FW), and show that wait-free consensus is solvable with FW-terminating registers and a leader oracle. We construct each of these t-tolerant registers from n > 3t base registers, t of which can be non-responsive or Byzantine. All the previous t-tolerant wait-free constructions in this model used at least 4t+1 fault-prone registers, and we are not familiar with any prior FW-terminating constructions in this model.

We further show tight lower bounds on the number of invocation rounds required for optimal resilience reliable register constructions, or more generally, constructions that use less than 4t+1 fault-prone registers. Our lower bounds show that such constructions are inherently more costly than constructions that use 4t+1 registers, and that our constructions have optimal round complexity. Furthermore, our wait-free construction is early-stopping, and it achieves the optimal round complexity with any number of actual failures.

Download preprint of Distributed Computing paper: ps, ps.gz, pdf, pdf.gz.
Download PODC 2004 paper: ps, ps.gz, pdf, pdf.gz.


Last modified: Sun Apr 23 13:39:19 IDT 2006