Martin Rinard

Data Structure Consistency


Data structure consistency is a core program correctness property. Errors that cause data structure corruptions can cause programs to produce incorrect results, crash, or infinite loop. Motivated by these issues, my research group pioneered two orthogonal data structure consistency techniques: data structure repair, which detects and repairs inconsistent data structures and data structure verification, which formally verifies that data structure implementations satisfy their specifications.

Our data structure repair research produced the first specification-driven data structure repair algorithms and systems. Our data structure verification research produced the first verification of fully functional correctness for linked data structures such as mutable lists, trees, graphs, and hash tables.

Data Structure Repair

Software errors (such as data structure implementation defects or memory errors in other components) and hardware errors can cause data structures in running programs to violate key data structure consistency properties. This violation can cause the program to produce unacceptable results or even fail. Our IEEE TSE 2006 paper presents techniques that repair the data structure, eliminate inconsistencies, and enable the system to continue to execute successfully. We have used our data structure repair system to repair inconsistent data structures and eliminate failures in a variety of applications, including an air-traffic control system, an open-source word processing system, and an interactive multiplayer game.

Earlier versions of this research appeared in Our ICSE 2005 paper, our ISSRE 2003 paper, our OOPSLA 2003 paper, and our SMS 2003 paper. This research produced the first specification-driven data structure repair algorithms and systems.

Formal Verification of Data Structure Consistency

Our data structure verification research pioneered foundational techniques (such as integrated reasoning) for specifying and verifying data structure correctness, including the first verification of fully functional correctness for a range of linked data structures such as mutable lists, trees, graphs, and hash tables. In general, we are interested in two kinds of data structure consistency:

Two problems complicate the realization of this goal. The first problem is scalability - analyses that are capable of verifying our target class of sophisticated consistency requirements simply do not scale to sizable programs and there is little hope that they will ever do so.

The second problem is diversity. There is an enormous range of data structures with very different consistency properties. Consider, for example, a list and a hash table. The list consistency properties involve multiple objects linked together with pointers, while the hash table consistency properties involve arrays and array indices. These different kinds of properties require analyses that focus on very different kinds of basic properties.

We adopt a new perspective on the data structure consistency problem: we propose a technique that developers can use to apply multiple pluggable analyses to the same program, with each analysis applied to the modules for which it is appropriate. The key features of this technique include:

Together, these features enable the focused application of a full range of precise analyses to programs that contain multiple data structures encapsulated in multiple modules. They promote the development of a range of pluggable analyses that developers can deploy as necessary to verify important data structure consistency properties. Abstract sets enable different analyses to communicate and interoperate to verify properties that cross module boundaries to involve multiple data structures. Our approach supports the appropriately verified participation of objects in multiple data structures, including patterns in which objects migrate sequentially through different data structures and patterns in which objects participate in multiple data structures simultaneously.

This research produced the following systems: