Introduction to Program Synthesis

© Armando Solar-Lezama. 2018. All rights reserved.

Lecture 1: Introduction and definitions

The dream of automating software development has been present from the early days of the computer age. Already back in 1945, as part of his vision for the Automatic Computing Engine, Alan Turing argued that

Instruction tables will have to be made up by mathematicians with computing experience and perhaps a certain puzzle-solving ability… This process of constructing instruction tables should be very fascinating. There need be no real danger of it ever becoming a drudge, for any processes that are quite mechanical may be turned over to the machine itself. copeland2012alan

Traditionally, the way automation was incorporated into software development was through the use of compilers and high-level languages. When the first FORTRAN compiler was developed, it was touted as "The FORTRAN Automatic Coding System"; its goal was nothing less than to allow the IBM 704 to code problems for itself and to produce programs as good as those of human coders (but without the errors) Backus:1957 .

Compilation and synthesis are very closely related in terms of their goals: both aim to support the generation of software from a high-level description of its behavior. In general, though, we expect a synthesizer to do more than translate a program from one notation to another as traditional compilers do; we expect it to discover how to perform the desired task. The line can be blurry, since some aggressive optimizing compilers can be argued to actually discover how to perform a computation that was specified at a high level of abstraction (parallelizing compilers are a good example). One distinguishing feature between a compiler and a synthesizer is the element of search. In a compiler, an input description of the computation is transformed into a program by applying transformation rules according to a pre-defined schedule. By contrast, a synthesizer is generally understood to involve a search for a program that satisfies the stated requirements. Again, the line is blurry, because several modern research compilers aggressively search the space of transformations to find optimal implementations, a process known as autotuning.

Another class of techniques that is closely associated with synthesis is declarative programming, and in particular logic programming. The dream of logic programming was that programmers would be able to express the requirements of their computation in a logical form, and when given an input, the runtime system would derive an output that satisfies the logical constraints through a combination of search and deduction. So the goals are also closely related to those of program synthesis, but there are some important distinctions. First, rather than trying to discover an algorithm to solve a particular problem, logic programming systems rely on a generic algorithm to search for a solution to every problem. This means that for many problems, they can be dramatically slower than a specialized program for the task at hand. Additionally, if the problem is under-specified, the user may get a solution at runtime that is very far from the one they expected.

Finally, Machine Learning corresponds to a third class of approaches that are closely related to program synthesis. The canonical problem in Machine Learning is finding a function whose behavior closely matches a given dataset. So machine learning problems can be thought of as a special case of program synthesis problems where the specification comes in the form of data. The biggest distinction between program synthesis and Machine Learning is that in Machine Learning, the space of functions that the algorithm considers is very tightly prescribed. For example, linear classifiers, decision trees, and neural networks are classes of functions that have been very well studied, and each of these classes has its own specialized set of algorithms for deriving a function that matches a dataset. By contrast, in program synthesis we are interested in general algorithms that can work with more general classes of programs, with a particular interest in programs that support recursion or other forms of iteration. Traditionally, there was a second important distinction in that program synthesis generally aspired to discovering programs that precisely matched the specification, whereas in machine learning the notion of learning from noisy data is deeply ingrained in all algorithms. This distinction is less relevant today, since there is strong interest in the synthesis community in algorithms that are robust to noise, or that behave well in the presence of incomplete specifications.

Note that thinking of machine learning as a form of program synthesis is different from the more recent trend of using machine learning to support general program synthesis. As we will learn in Unit 3, machine learning in general and language models in particular have come to play a prominent role in program synthesis as a way of guiding the search for programs and in helping to support more informal means of specification.

Program Synthesis

So if program synthesis is not compilation, it is not logic programming, and it is not machine learning, then what is program synthesis? As mentioned before, different people in the community have different working definitions of what they would describe as program synthesis, but I believe the definition below is one that both captures most of what today we understand as program synthesis and also excludes some of the aforementioned classes of approaches.
Program Synthesis corresponds to a class of techniques that are able to generate a program from a collection of artifacts that establish semantic and syntactic requirements for the generated code.
There are two elements of this definition that are important. The first is an emphasis on the generation of a program; we expect the synthesizer to produce code that solves our problem, as opposed to relying on extensive search at runtime to find a solution for a particular input, as logic programming systems do. The second is the emphasis on supporting specification of both semantic and syntactic requirements. We expect synthesis algorithms to provide us with some control over the space of programs that are going to be considered, not just their intended behavior. It is important to emphasize that individual synthesis systems may not themselves provide this flexibility; in fact, the biggest successes of synthesis so far have been in specialized domains where constraints on the space of programs have been "baked in" to the synthesis system. Nevertheless, even if the flexibility is not exposed to the users, the underlying synthesis algorithms do have significant flexibility in how the space of programs is defined, and this is a big differentiator both with respect to compilation and with respect to machine learning. In general, both of these requirements imply that our synthesis procedures will rely on some form of search, although the success of synthesis will be largely driven by our ability to avoid having to exhaustively search the exponentially large space of programs that arise for even relatively simple synthesis problems.
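To make the two elements of the definition concrete, here is a minimal sketch of an enumerative synthesizer (the tiny grammar and all names are invented for illustration, not taken from any particular system): the grammar encodes the syntactic requirements, the input/output examples encode the semantic requirements, and search over the program space ties the two together.

```python
from itertools import product

# Syntactic requirement: expressions over a variable x built from these pieces.
LEAVES = ["x", "1", "2"]
OPS = ["+", "*"]

def grow(terms):
    """One round of bottom-up enumeration: combine existing terms with each op."""
    return [f"({a} {op} {b})" for a, b, op in product(terms, terms, OPS)]

def synthesize(examples, rounds=2):
    """Search for a term whose value matches every (x, output) pair."""
    terms = list(LEAVES)
    for _ in range(rounds):
        terms += grow(terms)
        for t in terms:
            if all(eval(t, {"x": x}) == y for x, y in examples):
                return t
    return None

# Semantic requirement: behave like f(x) = 2*x + 1 on these examples.
print(synthesize([(1, 3), (2, 5), (3, 7)]))  # prints a term equivalent to 2*x + 1
```

Even this toy shows why naive search does not scale: each round of `grow` multiplies the number of terms quadratically, which is exactly the exponential blow-up that practical synthesis algorithms must avoid.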

Program Synthesis Today

If you have heard of program synthesis recently, it has probably been in the context of tools such as Copilot, which provide a form of auto-complete on steroids that can help programmers complete even large chunks of code from specifications written as comments, or in some cases just from the context of the code you have already written. Or perhaps you have heard of AlphaCode, a system that produced code from natural language specifications and input/output tests and was able to place above the 50th percentile in a programming competition. Both of these tools use Large Language Models (LLMs) based on Transformers to generate code from high-level specifications.

But there is more to program synthesis than LLMs. As powerful as they are, LLMs also have some important limitations: they cannot directly reason about the correctness of the code they produce, and the most powerful LLMs require large amounts of data and infrastructure to train. LLMs are also not a good fit for problems that require large amounts of exploration. So despite the success of LLMs, program synthesis remains an active area of research with research papers being published every year in all the major programming systems conferences (PLDI, POPL, OOPSLA), as well as in formal methods (CAV, TACAS) and machine learning (NeurIPS, ICLR, ICML).

Before the advent of LLMs, the focus of the field was on efficient search techniques that could explore large spaces of possible programs to find one that satisfied a set of requirements. Those techniques were limited to synthesizing fairly small programs, and could not take advantage of unstructured means of specification such as natural language. Despite these limitations, they achieved some impressive results. For example, early success stories included the ability to synthesize Karatsuba big-integer multiplication sketchthesis, Strassen's matrix multiplication Srivastava:2010, or the functional cartesian product algorithm of Barron and Strachey, which is considered the first functional pearl Feser:2015. The search-based techniques proved to be very effective for things like bit-vector manipulations; the winner of a program synthesis competition back in 2019 was able to synthesize every bit-vector manipulation that the organizers threw at it. Search-based techniques were also designed to work well with verification, enabling the synthesis of provably correct implementations of fairly complex algorithms; in a few years, the field was able to move from things like sorting and list reversal to algorithms and data-structure manipulations such as insertion into red-black trees or binary heaps Polikarpova:2016.
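As a small taste of why bit-vector manipulations are such a good fit for enumerative search, the toy sketch below (purely illustrative, not any actual competition entry) recovers the classic "clear the lowest set bit" trick by enumerating candidates of a fixed shape and checking them against a reference specification.

```python
from itertools import product

# Unary subexpressions and binary operators for candidates of shape  x OP f(x).
UNARY = {"~x": lambda x: ~x, "-x": lambda x: -x, "x - 1": lambda x: x - 1}
BINOP = {"&": lambda a, b: a & b, "|": lambda a, b: a | b, "^": lambda a, b: a ^ b}

def synthesize(spec, tests):
    """Return the first candidate expression that matches spec on all tests."""
    for (fname, f), (op, g) in product(UNARY.items(), BINOP.items()):
        if all(g(x, f(x)) == spec(x) for x in tests):
            return f"x {op} ({fname})"
    return None

# Reference semantics: clear the lowest set bit, e.g. 0b1100 -> 0b1000.
print(synthesize(lambda x: x & (x - 1), tests=range(1, 64)))  # → x & (x - 1)
```

The target programs in this domain are short and the semantics are cheap to evaluate, so exhaustive enumeration with good pruning goes a remarkably long way.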

Despite the advent of LLMs, search-based techniques remain relevant because they have important advantages over LLM-based approaches: they can be engineered to be extremely efficient, and they can be made effective without the need to collect large amounts of training data, which makes them useful for specialized applications for which such data is unavailable.

Program Synthesis Applications

One of the most obvious uses of program synthesis is as a software engineering aid. This is the application where tools such as Copilot have excelled, helping developers write production-level code more effectively. Those tools arose out of earlier work in the program synthesis community that aimed to leverage machine learning to help developers use complex APIs MuraliCJ17a. Many of the earlier program synthesis techniques were also envisioned as an aid to programmers working on especially challenging pieces of code, helping them develop those pieces correctly. However, there are other applications of program synthesis that have proven to be even more useful.

Lecture1:Slide14;Lecture1:Slide15 One important application has been in support of end-user or non-expert programming. Here, the idea is to help people dealing with small-scale programming tasks which they may not even recognize as small-scale programming tasks. For example, one area where program synthesis has been particularly successful is "Data Wrangling", the manipulation of data, especially by people with no prior programming experience. The first commercial application of program synthesis was FlashFill gulwani:2011:flashfill, a feature first incorporated into Excel 2013, which allows users to perform data manipulation by providing a few examples; it automatically derives a small program from those examples and applies it to the rest of the data. On the research side, there have been significant advances in the ability to synthesize fairly complex database queries either from examples Wang:2017, or from natural language queries Yaghmazadeh:2017. This area has proven to be a good fit for our current synthesis capabilities because on the one hand, there is a strong need to make data analysis and cleaning accessible to non-programmers, and on the other hand, the programs in question are generally small and easy to describe through examples or other forms of natural interaction.
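The sketch below gives a FlashFill-flavored toy (the DSL of string extractors is invented for illustration and is far simpler than FlashFill's actual language): programs are concatenations of simple extractors, and the synthesizer enumerates programs by length until one matches all the user's examples.

```python
from itertools import product

# A hypothetical DSL: each atom maps an input string to a fragment of output.
ATOMS = {
    "first_word": lambda s: s.split()[0],
    "last_word":  lambda s: s.split()[-1],
    "initial":    lambda s: s.split()[0][0],
    "lit_dot":    lambda s: ". ",
    "lit_space":  lambda s: " ",
}

def run(prog, s):
    """A program is a tuple of atom names; run it by concatenating the pieces."""
    return "".join(ATOMS[a](s) for a in prog)

def synthesize(examples, max_len=3):
    """Enumerate programs by increasing length until one fits every example."""
    for n in range(1, max_len + 1):
        for prog in product(ATOMS, repeat=n):
            if all(run(prog, s) == out for s, out in examples):
                return prog
    return None

examples = [("Jane Doe", "J. Doe"), ("Alan Turing", "A. Turing")]
print(synthesize(examples))  # → ('initial', 'lit_dot', 'last_word')
```

Two examples suffice here precisely because the DSL is so constrained, which mirrors the point made above: the end-user domains where synthesis has succeeded are ones where the target programs are small.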

Lecture1:Slide27; Lecture1:Slide28; Lecture1:Slide29; Lecture1:Slide33; Another area that has seen major interest is the reverse engineering of code. Traditionally, we think of synthesis as starting with a specification and generating an implementation from it. But in this case, that paradigm is flipped on its head: starting from an implementation, the goal is to infer a specification that characterizes the behavior of the given implementation. The idea was first proposed by Susmit Jha, Sumit Gulwani, Sanjit Seshia and Ashish Tiwari Jha:2010. It has more recently been popularized by Alvin Cheung in an approach known as Verified Lifting, where the goal is to discover a high-level representation that is provably equivalent to an implementation and that can be used to generate a more efficient version of the code. The idea was first applied to the problem of generating SQL queries that are equivalent to a piece of imperative code, but has since been applied to a variety of problems ranging from modernizing legacy HPC applications Kamil:2016 to optimizing MapReduce programs Ahmad:2018.

Another example of the use of synthesis for reverse engineering involves the creation of models of complex code for the purpose of program analysis. For example, a recent paper by Jinseong Jeon et al. Jeon:2016 showed that it was possible to use synthesis to create models of complex reactive frameworks such as Android and Java Swing by recording traces of the interaction between the framework and a test application and then forcing the synthesizer to produce a model that conforms to that trace and follows known design patterns. A similar idea was used by Heule, Sridharan and Chandra to synthesize models of array manipulation routines in JavaScript Heule:2015.

Lecture1:Slide41; Lecture1:Slide43; Lecture1:Slide44 One particularly interesting research thrust is the application of synthesis techniques to problems that seemingly have nothing to do with automatic programming; there is a growing realization that program synthesis techniques can actually be applied in a number of domains that have traditionally been thought of as AI. For example, back in 2013, we demonstrated that it was possible to apply program synthesis to the problem of providing feedback for programming assignments Singh:2013; similar ideas have since been applied to other forms of automated tutoring, from teaching automata theory D'antoni:2015 to teaching deduction AhmedGK13.

More broadly, one of the more exciting research directions around program synthesis is its use as a form of interpretable machine learning. For example, early work showed that it was possible to use program synthesis to do a form of unsupervised learning to learn visual concepts EllisST15, and more recent work showed that it could be used to understand language morpho-phonology Ellis22Linguistics. Combinations of program synthesis with deep learning have also proven very effective for a variety of tasks ranging from control pmlr-v80-verma18a to behavior modeling ShahAdmissibleHeuristics2020. These directions will be explored in more detail in Unit 3 of this course.

Challenges

In general, there are three major challenges one has to address when working with program synthesis. In a recent paper, we refer to these as the Three Pillars of machine programming GottschlichSTCR18.

Intention. The first challenge is what we have termed the Intention challenge: how do users convey their goals to the synthesizer? The definition of synthesis talks about semantic and syntactic constraints, but the exact form of these will influence all subsequent decisions about the synthesis system. The success of the FlashFill gulwani:2011:flashfill system has popularized the use of input-output examples as a means of specification, but input-output examples are not suitable for every task. In our own work on storyboard programming, we advocated for an approach to multi-modal synthesis, where concrete examples were combined with abstract examples and logical specifications, so that together they provided enough information about the intended behavior to produce a working implementation of a data-structure manipulation SinghS12.

One big aspect of the intention challenge is how to cope with under-specification. If there are multiple programs that satisfy the requirements, how can we tell which one the user actually wants? Of course, one solution is to simply ignore the problem: if the user provides a partial specification, they have no right to complain if they get a different program from the one they wanted. In practice, though, making a good choice can make the difference between a system that is useful and one that is not.
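A toy illustration of the issue (the candidate pool below is entirely hypothetical; a real system would search a whole DSL): a single example is matched by several programs, and one common mitigation is to rank the consistent candidates, here simply by expression size, so that the "simplest" one is returned; an additional example then disambiguates further.

```python
# Hypothetical candidate programs, each a Python expression over x.
CANDIDATES = ["x + 2", "x * 2", "x * x", "x * x * x - 4", "2 * x * x - 4"]

def synthesize(examples):
    """Return the shortest candidate consistent with every (x, y) example."""
    matches = [e for e in CANDIDATES
               if all(eval(e, {"x": x}) == y for x, y in examples)]
    return min(matches, key=len) if matches else None  # rank by expression size

print(synthesize([(2, 4)]))          # all five candidates fit; a shortest one wins
print(synthesize([(2, 4), (3, 9)]))  # → x * x  (the extra example disambiguates)
```

Ranking by size is only one heuristic; real systems also use learned priors or domain knowledge to decide which of the many consistent programs the user most likely intended.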

Invention. The second challenge, once we know what the user wants, is to actually discover a piece of code that satisfies those requirements. Arguably this is the central challenge of synthesis, as it potentially involves inventing new algorithmic solutions to a problem. One of the key topics we will cover in this course is the range of techniques the community has developed to tackle the inherent complexity of this task.

Adaptation. The canonical view of synthesis is that the user is creating a brand new algorithm from scratch, and wants to leverage a synthesizer to create a correct implementation of the desired algorithm. However, most software development involves working in the context of existing software systems, fixing bugs, optimizing code, and performing other kinds of maintenance tasks. This pillar deals with the question of synthesis in that broader context, and with the application of synthesis ideas to software development tasks beyond green-field software creation. As we will see later, there are a number of compelling applications of program synthesis in support of the broader software development process.