Extensible Data-Representation Selection for Correct-by-Construction Program Derivation

Benjamin Delaware, Sorawit Suriyakarn, Adam Chlipala


Program synthesis via refinement is a venerable approach for gradually transforming specifications into executable code, generating a proof trail showing that the final efficient program adheres to the specification. We present the first automated, proof-generating refinement system that invents new data structures to suit the needs of a program, applying global program analysis to understand those needs. Our system, based on the Fiat library for the Coq proof assistant, is extensible not just with new data structures but also with new rules for inventing more complex data structures by combining simpler ones. All data-structure operations come with semantic interfaces capturing their behavior in terms of abstract mathematical objects such as sets, and the system soundly and automatically tiles the specification with those operations to generate an OCaml-style program and a proof of its correctness. Starting from a high-level specification in extensible SQL-like language using logic to describe arbitrary operations on data, without explaining how to execute them, our system automatically matches those operations to proof-generating data-structure-construction rules in our library to build an efficient, correct-by-construction implementation.