Increasing the Resilience of Atomic Commit, at No Additional Cost.

Authors: Idit Keidar and Danny Dolev.

In the 1995 ACM-SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), May 1995, pages 245--254.

The full version of this paper will appear in the Journal of Computer and System Sciences (JCSS) special issue with selected papers from PODS 1995. Please see: full paper.

Abstract:

This paper presents a new atomic commitment protocol, Enhanced Three Phase Commit (E3PC), that always allows a quorum in the system to make progress. Previously suggested quorum-based protocols (e.g., the quorum-based Three Phase Commit (3PC)) allow a quorum to make progress in case of one failure. If failures cascade, however, and the quorum in the system is "lost" (i.e., at a given time no quorum component exists, e.g., because of a total crash), a quorum can later become connected and still remain blocked. With our protocol, a connected quorum never blocks. E3PC is based on the quorum-based 3PC, and it does not require more time or communication than 3PC. The principles demonstrated in this paper can be used to increase the resilience of a variety of distributed services, e.g., replicated database systems, by ensuring that a quorum will always be able to make progress.
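
To make the notion of a "lost" quorum concrete, the following is a minimal sketch of how one might check whether any connected component of a partitioned system contains a quorum. It is not the E3PC protocol, and it does not model commit or abort decisions. It assumes a simple majority quorum (the abstract does not fix a particular quorum system), and the site names, the `is_quorum` and `quorum_component` helpers, and the partition history are all hypothetical.

```python
from typing import FrozenSet, List, Optional

Site = str
Component = FrozenSet[Site]


def is_quorum(component: Component, all_sites: FrozenSet[Site]) -> bool:
    """Assumed majority quorum: strictly more than half of all sites."""
    return 2 * len(component & all_sites) > len(all_sites)


def quorum_component(
    components: List[Component], all_sites: FrozenSet[Site]
) -> Optional[Component]:
    """Return the connected component that holds a quorum, if any.

    With majority quorums at most one component can qualify,
    because two disjoint components cannot both hold a majority.
    """
    for component in components:
        if is_quorum(component, all_sites):
            return component
    return None


# Hypothetical five-site system.
sites = frozenset({"a", "b", "c", "d", "e"})

# Hypothetical sequence of network partitions (cascading failures).
history = [
    # One failure: {a, b, c} still forms a quorum.
    [frozenset({"a", "b", "c"}), frozenset({"d", "e"})],
    # A second failure: no component holds a quorum (the quorum is "lost").
    [frozenset({"a", "b"}), frozenset({"c"}), frozenset({"d", "e"})],
    # Later, a different quorum {a, d, e} becomes connected.
    [frozenset({"a", "d", "e"}), frozenset({"b"}), frozenset({"c"})],
]

for step, components in enumerate(history):
    found = quorum_component(components, sites)
    label = sorted(found) if found is not None else "no quorum component"
    print(f"step {step}: {label}")
```

The third step is the situation the abstract is concerned with: a quorum is connected again after a period in which none existed. According to the abstract, earlier quorum-based protocols may remain blocked at that point, whereas E3PC allows the connected quorum to make progress.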

Postscript Version: ps, ps.gz.


Last modified: Mon Jul 1 14:35:39 EDT 2002