Edmond Lau :: March 8, 2005

6.894: Lightweight Formal Methods

Case Study of Transaction Locking

1 Introduction

Database management systems must support multiple clients concurrently executing transactions on stored data. The ability to efficiently handle concurrent transactions is one of the core requirements for high performance. In a simple database management system that handles its transactions serially, there is no risk of transactions simultaneously modifying the same data; in a system that supports concurrency, however, transactions that simultaneously access or modify the same data without any access constraints will likely result in an inconsistent database state. Although the ideas presented in this case study will be discussed against the backdrop of database systems, it is important to realize that the same issues with transactions and locking will apply to any system, such as file systems with multiple clients or operating system threads, that handles concurrent accesses and modifications to shared resources.

To understand the importance of correctly regulating shared data accesses, consider an online flight reservation system. Suppose I were booking an airplane ticket from Boston to San Francisco with a connecting flight through Chicago. If I see that seats are available and choose to buy the ticket, I would not be happy if the system charged my credit card for the flight from Boston to Chicago and then notified me that the last seat in the Chicago to San Francisco flight had just been booked by some other customer right as my credit card had been charged. Constraints on accesses of the shared data, which in this case is the data regarding available seating, must be regulated to prevent this scenario.

To control shared data access, database management systems typically require transactions to acquire locks on the data to prevent other transactions from concurrently modifying the data; the locks are subsequently released after the transaction either aborts or commits all of its changes. In this case study, I present a static Alloy model of transaction locking, illustrate the complications of deadlocks that accompany locking mechanisms, and verify the correctness of a locking protocol known as resource ordering that will prevent deadlocks.

2 Overview of Transactions and Locking

2.1 Transactions

A transaction is a sequence of operations that occur atomically; in other words, the composite nature of the transactions is invisible to other clients of the system. Atomicity implies that 1) either all of the operations in the transaction succeed or they all fail and that 2) no other transactions should be able to view any intermediate results of a partially complete transaction. The first property is known as recoverability, and the second is known as isolation. If all of the operations in a transaction complete successfully, the transaction is said to commit; otherwise, if an operation fails, the transaction is said to abort and may need to roll back any changes that it had previously made. In this case study, I focus on the isolation property of transactions.

Conventional database texts usually use variables as the resource shared among different transactions, so I will follow that convention in this case study. Note that instead of variables, the shared resource could just as easily have been memory, files, blocks, etc. The following problems can arise when transactions concurrently modify the same variable:

Note that the above problems occur because some transaction wants to write to shared data. If the transactions had been read-only, no conflict would have occurred even if the accessed data were shared among multiple transactions.

2.2 Locking

The standard technique for solving concurrency issues employs locks. Prior to reading or writing to a variable X, a transaction must acquire a lock on X, which it releases after executing all of its operations. Other transactions that need to access X must wait until the lock is released before they can acquire it.

Some database systems only offer one type of lock; others distinguish between shared locks and exclusive locks. A transaction acquires a shared lock if it needs to read a variable and an exclusive lock if it needs to write to a variable; this division enables multiple transactions to read a variable concurrently.

The second domain in which locking mechanisms can differ is in lock granularity. At one extreme, the system may only have one system-wide lock that all transactions must acquire prior to accessing data, in which case the system reduces to a serial one. At the other extreme, there may one lock per row in a database table. While the usage of more specific locks increases the number of possible concurrent transactions, it also increases additional computational overhead to manage the greater number of locks and also increases the possibility of deadlock.

Deadlocks, also known as circular waiting, occur when two or more transactions are each waiting for locks to data that they need to access before they can commit, but some other transaction within that group is currently holding a lock on that data. For example, if transaction T1 is holding a lock on X and needs to access Y, but transaction T2 is holding a lock on Y and needs to access X, the two transactions are deadlocked because each one requires a lock that the other is currently holding.

Solutions to deadlocks include implementing system timeouts that cause transactions to abort if they have waited longer than some amount of time or producing dependency graphs that constrain how transactions can obtain locks. In this case study, I analyze a simple protocol known as resource ordering that establishes a total ordering on all resources and requires each transaction to obtain locks on its required resource in increasing order.

Informally, resource ordering can be argued to prevent deadlocks as follows: at each point in time, there will always be a transaction that is holding a lock to a variable whose ordering is greater than that of all other locked variables. This transaction cannot be deadlocked because all the remaining variables it needs to access have a higher ordering than all other locked variables, so those remaining variables must be unlocked. Therefore, the transaction can run to completion, at which point, we can inductively use the same argument with one less transaction.

3 Alloy Model and Commentary

3.1 Alloy Model

I first present my Alloy model in its entirety and then proceed to explain the significance of each sigment with accompanying visualizations. My Alloy model looks as follows:

module transactions/locking

open util/ordering[Variable]

sig Transaction {
	holds: set Lock,
	reads: set Variable,
	writes: set Variable
}
abstract sig Lock {
	guards: some Variable
} {
	-- locks are dynamically generated on a per-request basis and destroyed upon release
	--   so each lock that exists is held by some transaction and there are no unheld locks
	Lock = Transaction.holds
}
sig ExclusiveLock, SharedLock extends Lock {}
sig Variable {}

-- returns set of shared locks held by t
fun sharedLocks(t: Transaction): set Lock {
	SharedLock <: (t.holds)
}
-- returns set of exclusive locks held by t
fun exclusiveLocks(t: Transaction): set Lock {
	ExclusiveLock <: (t.holds)
}

fact lockingRules {
	-- each transaction only holds at most one lock for a given variable
	all t: Transaction, disj lock, lock': t.holds | no lock.guards & lock'.guards

	-- if a transaction holds an exclusive lock on some variable, 
	--    no other transaction can acquire locks on same variable
	all t: Transaction, e: exclusiveLocks(t), v: e.guards |
		no t': Transaction - t | v in t'.holds.guards
}

/*
fact noExcessiveLocking {
	-- don't acquire an exclusive lock on a variable if you don't need to write to it
	all t: Transaction, v: Variable | v not in t.writes => v not in exclusiveLocks(t).guards

	-- don't acquire a lock on a variable if you don't need to read or write to it
	all t: Transaction, v: Variable | v not in t.(reads + writes) => v not in t.holds.guards
}
*/

fact noExcessiveLocking {
	all t: Transaction |
		-- don't acquire a shared lock if you don't need to read it any of the variables it guards
		(all lock: sharedLocks(t) | some lock.guards & t.reads) and
		-- don't acquire an exclusive lock if you don't need to write to any of the variables it guards
		(all lock: exclusiveLocks(t) | some lock.guards & t.writes)
}

pred acquireLocksForReads (t: Transaction) {
	-- for all variables that t reads, it must hold a lock for that variable
	all v: t.reads | some lock: t.holds | v in lock.guards
}
pred acquireExclusiveLocksForWrites (t: Transaction) {
	-- for all variables that t writes, it must hold an exclusive lock for that variable
	all v: t.writes | some lock: exclusiveLocks(t) | v in lock.guards
}
pred showAllRequiredLocksAcquired () {
	-- all transactions have all required locks
	all t: Transaction | acquireLocksForReads(t)
	all t: Transaction | acquireExclusiveLocksForWrites(t)

	-- guide the simulation
	some disj t, t': Transaction | some t.reads and some t'.reads
	some writes
}
run showAllRequiredLocksAcquired

pred waitingForReadLock (t: Transaction) {
	-- some other transaction is holding an exclusive lock on a variable 
	--   that t needs to read
	some v: t.reads |
		v not in t.holds.guards and
		some t': Transaction - t, e: exclusiveLocks(t) | v in e.guards
}
pred waitingForWriteLock(t: Transaction) {
	-- some other transaction is holding a lock (either shared or exclusive) 
	--  on a variable that t needs to write to
	some v: t.writes | 
		v not in exclusiveLocks(t).guards and 
		some t': Transaction - t | v in t'.holds.guards
}
pred waitingForLock(t: Transaction) {
	waitingForReadLock(t) or waitingForWriteLock(t)
}
run waitingForLock for 2

pred deadlock() {
	all t: Transaction | waitingForLock(t)
	some ExclusiveLock
}
run deadlock for 2

pred brokenOrderedLockingProtocol () {
	-- don't lock a variable if there's some lesser ranked variable that:
	--   you need to read and haven't obtained a lock for or
	--   you need to write and haven't obtained an exclusive shared lock for
	all t: Transaction | 
		no v: Variable |
			let alreadyLocked = t.holds.guards |
				(some v': alreadyLocked | lt (v, v') and v in t.(reads - holds.guards)) or
				(some v': alreadyLocked | lt (v, v') and v in t.writes - exclusiveLocks(t).guards)
}
assert brokenOrderedLockingProtocolWorks {
	brokenOrderedLockingProtocol() => !deadlock()
}
check brokenOrderedLockingProtocolWorks for 2


pred orderedLockingProtocol () {
	brokenOrderedLockingProtocol ()
	acquireLockOncePerVariable()
	-- either minimal granularity or minimal locking is required for protocol to work
	minimalGranularity() or minimalLocking()
}
pred acquireLockOncePerVariable() {
	-- if you need to read and write a variable, obtain an exclusive lock and not a shared lock
	all t: Transaction, v: t.(writes & reads) | no lock: sharedLocks(t) | v in lock.guards
}
pred minimalGranularity () {
	-- locks are on the same granularity as variables
	all lock: Lock | lone lock.guards
}
pred minimalLocking () {
	all t: Transaction {
	 	all e: exclusiveLocks(t).guards | e in t.writes
		all s: sharedLocks(t).guards | s in t.reads
	}
}
assert orderedLockingWorks {
	orderedLockingProtocol () => !deadlock()
}
check orderedLockingWorks for 5

3.2 Commentary

For now, I'll skip the open statement and come back to it later.

3.2.1 Signatures and Fields

The first main portion of the model consists of the signatures and fields:

sig Transaction {
	holds: set Lock,
	reads: set Variable,
	writes: set Variable
}
abstract sig Lock {
	guards: some Variable
} {
	-- locks are dynamically generated on a per-request basis and destroyed upon release
	--   so each lock that exists is held by some transaction and there are no unheld locks
	Lock = Transaction.holds
}
sig ExclusiveLock, SharedLock extends Lock {}
sig Variable {}

The reads and writes relations capture the sets of Variables that a particular Transaction needs to read and write, respectively, in order for it to run to completion. The holds relation describes the set of Locks that the Transaction is currently holding. Because the model is static, we can envision any instance as a snapshot of the current state of all transactions, the variables that they need to read and write, and the locks that they currently hold.

Locks are partitioned into two types: ExclusiveLocks and SharedLocks. Each lock guards some set of Variables; because the size of the set of is unspecified, the model supports arbitrary lock granularity. The signature fact appended to Lock models a system where locks are dynamically created when transactions request them, as opposed to having all possible locks for all variables created beforehand; the fact has the added sided effect of eliminating all instances with "floating" locks not held by any transaction.

The following two functions, sharedLocks and exclusiveLocks, that appear after the signature declarations provide two convenience access methods to the shared and exclusive locks held by a transaction; their main purpose is to aid readability of the remaining model.

-- returns set of shared locks held by t
fun sharedLocks(t: Transaction): set Lock {
	SharedLock <: (t.holds)
}
-- returns set of exclusive locks held by t
fun exclusiveLocks(t: Transaction): set Lock {
	ExclusiveLock <: (t.holds)
}

3.2.2 Facts

The fact lockingRules is the first interesting piece of the model. The first constraint limits the number of locks that a transaction can hold for a given variable to one; this constraint seems reasonable because a transaction should have no reason to hold multiple locks on the same variable. The second constraint in the fact explains the semantics of exclusive locks; namely, if one transaction holds an exclusive lock on a variable, no other transaction holds a lock to that same variable.

fact lockingRules {
	-- each transaction only holds at most one lock for a given variable
	all t: Transaction, disj lock, lock': t.holds | no lock.guards & lock'.guards

	-- if a transaction holds an exclusive lock on some variable, 
	--    no other transaction can acquire locks on same variable
	all t: Transaction, e: exclusiveLocks(t), v: e.guards |
		no t': Transaction - t | v in t'.holds.guards
}

The second fact, noExcessiveLocking, expresses some basic intuitions regarding when transactions want to acquire locks. In particular, a transaction should only hold a shared lock for variable X if it needs to read X, and it should only hold an exclusive lock for X if it needs to write to X.

fact noExcessiveLocking {
	all t: Transaction |
		-- don't acquire a shared lock if you don't need to read it any of the variables it guards
		(all lock: sharedLocks(t) | some lock.guards & t.reads) and
		-- don't acquire an exclusive lock if you don't need to write to any of the variables it guards
		(all lock: exclusiveLocks(t) | some lock.guards & t.writes)
}

At this point, I'd like to take a short digression and present an overly restrictive definition of noExcessiveLocking that I had originally wrote, and which is now commented out in the model:

/*
fact noExcessiveLocking {
	-- don't acquire an exclusive lock on a variable if you don't need to write to it
	all t: Transaction, v: Variable | v not in t.writes => v not in exclusiveLocks(t).guards

	-- don't acquire a lock on a variable if you don't need to read or write to it
	all t: Transaction, v: Variable | v not in t.(reads + writes) => v not in t.holds.guards
}
*/

The goal of this original definition was to express the same idea that a transaction should not acquire an exclusive lock on a variable if it did not need to write to it and should not acquire any lock on a variable if it didn't need to read or write to it. However, the two definitions have a subtle distinction. In particular, the definition in the model allows a transaction t to hold a lock on some set of variables if t needs to read or write any subset of those variables, whereas the original definition I wrote required t to need to read or write all of the variables in that set. If we imagine a system where the level of locking granularity was somewhat larger than the variables, say where locks applied to an entire data table instead of individual rows, the original definition would require that the transaction need to read or write every single row in order for it to hold a lock!

3.2.3 Simulations of Successful Transactions

For the first simulation, we take an optimistic approach toward simulation and try to generate some instances in which every transaction successfully commits. To attain this goal, I defined two predicates acquireLocksForReads and acquireLocksExclusiveLocksForWrites that very straightforwardly declare that each transaction is holding all of the locks it needs to respectively read and write its desired variables:

pred acquireLocksForReads (t: Transaction) {
	-- for all variables that t reads, it must hold a lock for that variable
	all v: t.reads | some lock: t.holds | v in lock.guards
}
pred acquireExclusiveLocksForWrites (t: Transaction) {
	-- for all variables that t writes, it must hold an exclusive lock for that variable
	all v: t.writes | some lock: exclusiveLocks(t) | v in lock.guards
}
pred showAllRequiredLocksAcquired () {
	-- all transactions have all required locks
	all t: Transaction | acquireLocksForReads(t)
	all t: Transaction | acquireExclusiveLocksForWrites(t)

	-- guide the simulation
	some disj t, t': Transaction | some t.reads and some t'.reads
	some writes
}
run showAllRequiredLocksAcquired

The showAllRequiredLocksAcquired simulation predicate states that all transactions have all their required locks for reads and writes and has some additional constraints to guide the simulation. Running this predicate generates the following visualization, which I have projected onto Transaction:

The visualization shows that Transaction_0 needs to read Variable2 and has acquired SharedLock0 for it. Transaction_1 needs to read Variable0, write Variable1, and read Variable2; accordingly, it holds a ExclusiveLock0 over Variable0 and Variable1 and SharedLock0 over Variable2. Both transactions therefore hold the requisite locks to read and write their target variables.

3.2.4 Simulations of Waiting Transactions and Deadlock

The next concept I introduce into my model is the concept of a waiting transaction, which is a transaction that either needs to read or write a variable that some other transaction is holding an exclusive lock to. The predicate waitingForReadLock states that transaction t needs to read a variable for which it does not have a lock and for which some other transaction already holds an exclusive lock to; waitingForWriteLock states that t needs to write to a variable for which it does not have an exclusive lock and for which some other transaction already holds an exclusive lock to:

pred waitingForReadLock (t: Transaction) {
	-- some other transaction is holding an exclusive lock on a variable 
	--   that t needs to read
	some v: t.reads |
		v not in t.holds.guards and
		some t': Transaction - t, e: exclusiveLocks(t) | v in e.guards
}
pred waitingForWriteLock(t: Transaction) {
	-- some other transaction is holding a lock (either shared or exclusive) 
	--  on a variable that t needs to write to
	some v: t.writes | 
		v not in exclusiveLocks(t).guards and 
		some t': Transaction - t | v in t'.holds.guards
}
pred waitingForLock(t: Transaction) {
	waitingForReadLock(t) or waitingForWriteLock(t)
}
run waitingForLock

pred deadlock() {
	all t: Transaction | waitingForLock(t)
}
run deadlock for 2

Deadlock occurs when all transactions are waiting. Running deadlock generates the following instance of deadlock:

We observe from the visualization that Transaction_0 holds an exclusive lock to Variable0 and needs to read/write Variable1, but Transaction_1 holds an exclusive lock on Variable1 and needs to read/write Variable0. Thus, we have an instance of deadlock where neither transaction can proceed.

3.2.5 Resource Ordering as a Solution to Deadlock

In the last part of the model, I verify the correctness of using resource ordering for a locking protocol and check that it prevents deadlock. Recall that resource ordering establishes a total ordering on all variables and requires all transactions to acquire locks for their variables in increasing order. To accomodate this ordering, I use the util/ordering module parameterized on Variable to establish a total ordering on variables; this explains the following line at the top of the model:

open util/ordering[Variable]

The locking protocol seemed fairly straightforward to me, and I wrote down the constraints in the following predicate, which I have renamed to brokenOrderedLockingProtocol. The predicate states that for all transactions t, there is no yet-to-be-locked variable v that t needs to read if t already holds a lock for some variable v' earlier in the ordering that it needs. It also states that there is no yet-to-be-exclusively-locked variable v that t needs to write if t already holds a lock for some variable v' earlier in the ordering that it needs.

pred brokenOrderedLockingProtocol () {
	-- don't lock a variable if there's some lesser ranked variable that:
	--   you need to read and haven't obtained a lock for or
	--   you need to write and haven't obtained an exclusive shared lock for
	all t: Transaction | 
		no v: Variable |
			let alreadyLocked = t.holds.guards |
				(some v': alreadyLocked | lt (v, v') and v in t.(reads - holds.guards)) or
				(some v': alreadyLocked | lt (v, v') and v in t.writes - exclusiveLocks(t).guards)
}
assert brokenOrderedLockingProtocolWorks {
	brokenOrderedLockingProtocol() => !deadlock()
}

However, as suggested by the name, the following protocol actually does not work, and generates the following counterexample:

The additional next_ relations show the ordering created by util/ordering. In this example, Variable0 comes before Variable1 in the total ordering.

From the visualization, we see that Transaction_0 holds an exclusive lock on Variable0 and a shared lock on Variable1, which means that it has not violated the resource ordering constraint. It is waiting to acquire an exclusive write lock to Variable1. Transaction_1 holds a shared lock on Variable1 (the only variable it needs to access) and is waiting to acquire an exclusive write lock to Variable1 as well. What went wrong with our simple protocol? It seems that the problem stems from the fact that both transactions need to read and write Variable1, but both have only secured shared locks for reading the variable, which at the same time prevents either transaction from proceeding. Thus, perhaps if a transaction needs to both read and write a variable, it should not acquire a shared lock for that variable. In a dynamic situation, however, where the set of variables the transaction reads and writes is not predetermined, the transaction may only need to read the variable initially and then later decide to write to it.

To capture the aforementioned idea, I wrote the following acquireLockOncePerVariable predicate:

pred acquireLockOncePerVariable() {
	-- if you need to read and write a variable, obtain an exclusive lock and not a shared lock
	all t: Transaction, v: t.(writes & reads) | no lock: sharedLocks(t) | v in lock.guards
}

However, conjoining this predicate with brokenOrderedLockingProtocol still generated a counterexample to the assertion that deadlock would be prevented:

The problem now is that when acquiring a shared lock to read Variable1, Transaction_0 and Transaction_1 both acquired a lock that was too coarse and that also guarded Variable2. Neither transaction is then able to obtain an exclusive lock to Variable2, which both need to write to. I found two remedies to this problem, both of which add an additional constraint on how transactions obtain locks: the first, was to make the locks more specific and capable of guarding only one variable each; the second was to forbid a transaction from ever obtaining an exclusive lock on a variable that it did not need to write or a shared lock on a variable that it didn't need to read. These fixes are presented as minimalGranularity and minimalLocking below:

pred orderedLockingProtocol () {
	brokenOrderedLockingProtocol ()
	acquireLockOncePerVariable()
	-- either minimal granularity or minimal locking is required for protocol to work
	minimalGranularity() or minimalLocking()
}
pred acquireLockOncePerVariable() {
	-- if you need to read and write a variable, obtain an exclusive lock and not a shared lock
	all t: Transaction, v: t.(writes & reads) | no lock: sharedLocks(t) | v in lock.guards
}
pred minimalGranularity () {
	-- locks are on the same granularity as variables
	all lock: Lock | lone lock.guards
}
pred minimalLocking () {
	all t: Transaction {
	 	all e: exclusiveLocks(t).guards | e in t.writes
		all s: sharedLocks(t).guards | s in t.reads
	}
}
assert orderedLockingWorks {
	orderedLockingProtocol () => !deadlock()
}
check orderedLockingWorks for 5

The fixed orderedLockingWorks now mends the broken protocol by constraining transactions to only acquire locks once per variable and by either enforcing minimal granularity of locks or minimal locking by transactions. Checking orderedLockingWorks for a scope of 5 finds no counterexample, which gives us reasonable confidence, that deadlocks have been prevented.

4 Conclusion

Constructing an Alloy model turned out to be extremely helpful in modelling transaction locking. Formally expressing concepts such as transactions and locks in Alloy signatures and fields generated design issues such as lock granularity and exclusivity. Model finding and guided simulation enabled a very fluid incremental construction of the model.

More importantly, the automatic analysis showed some properties of the locking protocol based on resource ordering that I had not previously known. From the informal argument that I presented at the beginning of the case study, the correctness of resource ordering had seemed relatively apparent. However, the analysis revealed notions that I had not considered seriously until now. Resource ordering prevents deadlocks only insofar as transactions do not acquire shared read locks when they also need to write to the data. In real world situations, where the variables that a transaction reads and writes may be dynamically determined by values that it has read thus far, predicting which variables will be written to is not always possible. Moreover, resource ordering also requires that either locks be more specific or that transactions are careful as to the coarseness of locks that may acquire, both of which may not be possible in a high performance database system that does not support extremely fine-grained locks. Ultimately, this Alloy case study has demonstrated that that resource ordering may not be a complete solution to real-world transaction locking and that timeout mechanisms may be necessary to prevent deadlock.

edmond@mit.edu