Byzantine Disk Paxos:
Optimal Resilience with Byzantine Shared Memory.
Authors:
Ittai Abraham,
Gregory Chockler,
Idit Keidar,
and
Dahlia Malkhi.
In Distributed Computing, first online issue,
November 2005, in print in Volume 18, Number 5, April 2006,
Springer.
Previous version in the 23rd ACM
Symposium on Principles of Distributed Computing (PODC '04),
St. John's, Newfoundland, Canada, July 2004.
Abstract:
We present Byzantine Disk Paxos, an asynchronous shared-memory
consensus algorithm that uses a collection of n > 3t disks,
t of
which may fail by becoming non-responsive or arbitrarily corrupted. We
give two constructions of this algorithm; that is, we construct two
different t-tolerant
(i.e., tolerating up to t disk failures)
building blocks, each of which can be used, along with a leader
oracle, to solve consensus. One building block is a t-tolerant
wait-free shared safe register. The second building block is a
t-tolerant regular register that satisfies a weaker termination
(liveness) condition than wait freedom: its write operations are
wait-free, whereas its read operations are guaranteed to return only
in executions with a finite number of writes. We call this
termination condition finite writes (FW), and show that
wait-free consensus is solvable with FW-terminating
registers and a leader oracle. We construct each of these t-tolerant
registers from n > 3t base registers, t of which can be
non-responsive or Byzantine. All the previous t-tolerant wait-free
constructions in this model used at least 4t+1 fault-prone
registers, and we are not familiar with any prior FW-terminating
constructions in this model.
We further show tight lower bounds on the number of invocation rounds
required for optimal resilience reliable register constructions, or
more generally, constructions that use less than 4t+1
fault-prone registers. Our lower bounds show that such constructions
are inherently more costly than constructions that use 4t+1
registers, and that our constructions have optimal round
complexity. Furthermore, our wait-free construction is
early-stopping, and it achieves the optimal round complexity
with any number of actual failures.
Download preprint of Distributed Computing paper:
ps,
ps.gz,
pdf,
pdf.gz.
Download PODC 2004 paper:
ps,
ps.gz,
pdf,
pdf.gz.
Last modified: Sun Apr 23 13:39:19 IDT 2006