Introduction to Program Synthesis

© Armando Solar-Lezama. 2018. All rights reserved.

Lecture 13: Functional Synthesis with Sketch.

In the last lecture we already saw an example where Sketch was used to synthesize a code fragment that we wanted to be correct for all inputs. In Sketch, this can be done by adding parameters to the test harness. These parameters are treated as universally quantified variables, so the goal of the synthesizer is to solve for a piece of code that works for all values of these inputs within a prescribed domain. In the presence of such universally quantified variables, Sketch uses the CEGIS algorithm described in Lecture 10. In this lecture, we will focus on three aspects of Sketch that relate to synthesis with universally quantified variables: how to deal with loops and recursion, some important cases where CEGIS fails, and some limitations of the theory reasoning in Sketch.

Loops and recursion.

Lecture13:Slide7 In the last lecture, we saw one approach for dealing with loops: generate a verification condition and synthesize loop invariants and variants that allow us to prove total correctness of the synthesized code. This approach is very powerful and has the advantage of generating a provably correct program. However, it also has some important limitations. First, for some programs, the loop invariants and variants may end up being much more complex than the code itself, so forcing the synthesizer to produce these in addition to the code can make the synthesis problem much harder. For programs manipulating heap-based data structures, or for programs where the correctness condition involves proving the equivalence of two different algorithms, synthesizing all the artifacts necessary for verification will simply be too hard.

The Sketch solver follows a different approach: unrolling loops. The idea is to exploit the following identity: while(b){ C; } is equivalent to if(b){ C; while(b){ C; } }

Lecture13:Slide13 Unrolling the loop means applying this equivalence repeatedly. At some point we must stop the conversion, and we simply replace the remaining while loop with assert false;. This assertion means that any input that would have caused the loop to iterate more will cause an assertion failure. Loop unrolling therefore generally has to be accompanied by an assumption that rules out inputs that would cause the loop to iterate too many times. For example, if a program contains a loop of the form: i = 0; while(i < N){ i = i+1; } Then if the loop is being unrolled 5 times, we need an assume that ensures that $N$ cannot be greater than 5. In Sketch, there are two ways to impose such bounds. One is to use the assume construct; in this case we could add assume N < 5; right before the loop. Alternatively, one can bound the number of bits for input values. In Sketch, any integer input to the test harness defaults to a 5-bit non-negative integer, so it ranges from 0 to 31, but this range can be extended or restricted with the --bnd-inbits n flag, which sets the number of bits of integer inputs to n. The unrolling amount itself is controlled by the flag --bnd-unroll-amnt n, which determines how many times a given loop will be unrolled.
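The effect of bounded unrolling on the example loop can be simulated outside of Sketch. The following is a minimal Python sketch, not Sketch's actual implementation: the names run_loop and run_unrolled are hypothetical, and the residual loop is modeled under the assumption that the assertion fires only when the loop guard still holds after the last unrolled copy.

```python
def run_loop(n):
    """Reference semantics: i = 0; while (i < N) { i = i + 1; }"""
    i = 0
    while i < n:
        i = i + 1
    return i


def run_unrolled(n, unroll_amnt):
    """Bounded version: the guarded body is copied unroll_amnt times and the
    residual loop is replaced by an assertion failure (modeling assumption:
    the assertion fires only if the guard is still true)."""
    i = 0
    for _ in range(unroll_amnt):  # each pass stands for one nested if(b){ C; ... }
        if not (i < n):
            return i              # guard is false: the loop exits normally
        i = i + 1
    # Residual while(b){ C; }: reaching it with a true guard is an error.
    assert not (i < n), "loop needs more iterations than the unroll bound"
    return i
```

Under this modeling, run_unrolled(n, 5) agrees with run_loop(n) for every n that the bound can handle and raises an AssertionError for larger n, which is why the harness needs an assume or a small enough input bit-width to keep such inputs out.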

An obvious question is why the last instance of the loop is replaced with assert false instead of assume false. If we used assume instead of assert, then any input that would have caused the loop to iterate more than the unroll amount would automatically be ignored, and there would be no need for an extra assumption about the ranges of input values. The reason Sketch does not do this is that with an assume, it would be too easy for the synthesizer to solve a sketch by causing its loops to iterate beyond their iteration bound: every input that reaches the bound would be discarded, so such a candidate would trivially satisfy the specification on those inputs.
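The difference between the two choices can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the candidate family (a loop with an unknown step of 0 or 1), the exception class AssumeFailed, and the helper names are all hypothetical and are not part of Sketch.

```python
class AssumeFailed(Exception):
    """Signals that an input should be ignored because an assume was violated."""


def run_candidate(step, n, unroll_amnt, bound_is_assume):
    """Unrolled form of: i = 0; while (i < n) { i = i + step; } assert i == n;"""
    i = 0
    for _ in range(unroll_amnt):
        if not (i < n):
            break
        i = i + step
    if i < n:  # the loop wanted to keep iterating past the unroll bound
        if bound_is_assume:
            raise AssumeFailed()
        raise AssertionError("unroll bound exceeded")
    assert i == n


def verifies(step, unroll_amnt, bound_is_assume, inputs=range(32)):
    """True if the candidate passes on every input that is not ignored."""
    for n in inputs:
        try:
            run_candidate(step, n, unroll_amnt, bound_is_assume)
        except AssumeFailed:
            continue          # input discarded by the assume
        except AssertionError:
            return False      # genuine failure
    return True
```

With the bound modeled as an assume, the degenerate candidate step = 0, whose loop never terminates for any positive n, is accepted because every such input is discarded. With the bound modeled as an assert, the same candidate is rejected, and the correct candidate step = 1 is accepted once the inputs are restricted to values the unrolling can handle.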

There is a similar story with recursion. Recursive functions get inlined into their calling context up to a depth determined by the --bnd-inline-amnt n flag. It is worth noting that both in the case of loops and in the case of recursion, if the number of iterations or the number of recursive calls can be determined at compile time, then the respective flags will be ignored in favor of this known amount. So for example, if a program contains a loop of the form for(int i=0; i<100; ++i), this loop will be unrolled 100 times regardless of the value of the --bnd-unroll-amnt flag.

Beyond CEGIS.

Lecture13:Slide15; Lecture13:Slide16; Lecture13:Slide17; Lecture13:Slide18; Lecture13:Slide19; Lecture13:Slide20; Lecture13:Slide21; Lecture13:Slide22 As mentioned in Lecture 10, CEGIS can sometimes fail in cases where a single counterexample eliminates only a small number of candidate solutions. One particularly important example where this failure happens was first observed by Jha, Gulwani, Seshia and Tiwari [Jha:2010]. They were focusing on the problem of assembling a set of components to satisfy some specification. A simple way of doing this in Sketch is shown in the figure. It involves replicating each component $N$ times and adding a set of assertions to impose constraints such as requiring that each component is used only once. A seemingly more efficient approach is to invoke each component only once with a temporary input, and to assert that the output condition is satisfied assuming that the inputs are equal to the corresponding outputs.

This formulation of the problem is seemingly more efficient, since it does not require invoking each of the $N$ components $N$ times, but it is a very bad fit for CEGIS. The problem is that the assumptions are too strong and can be controlled by the unknown parameters, so it is easy for the synthesizer to get a counterexample to work by making the assumptions invalid. For example, consider the concrete case illustrated in the figure, where the components correspond to out=in+10, out=in*2 and out=in+1. Now suppose the specification we want to satisfy is out == (in+10)*2 + 1. Given an incorrect solution that first multiplies by 2, then adds 10, and then adds 1, the checker can produce a counterexample. The problem is that the counterexample includes not just the counterexample input, but also the temporary values that correspond to this particular input and this particular order of the components. Unfortunately, these temporary values are so specific that even swapping the +1 and the +10 components (which has no effect on the semantics of the composition) is enough to invalidate the counterexample and cause it to fail the assumptions.
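The fragility of such counterexamples can be reproduced with a small Python model of the three components. This is an illustrative sketch, not the encoding used by Sketch or by Jha et al.: the dictionary of components, the representation of the temporaries as (input, output) pairs, and all function names are assumptions made for the example.

```python
from itertools import permutations

COMPONENTS = {
    "add10": lambda x: x + 10,
    "mul2": lambda x: x * 2,
    "add1": lambda x: x + 1,
}


def spec(x):
    return (x + 10) * 2 + 1


def temps_for(order, x):
    """Temporary (input, output) pairs produced by running `order` on x."""
    temps, prev = {}, x
    for name in order:
        out = COMPONENTS[name](prev)
        temps[name] = (prev, out)
        prev = out
    return temps


def assumptions_hold(order, x, temps):
    """P: the temporaries are consistent with wiring `order` on input x."""
    prev = x
    for name in order:
        t_in, t_out = temps[name]
        if t_in != prev or t_out != COMPONENTS[name](t_in):
            return False
        prev = t_out
    return True


def passes(order, x, temps):
    """P => Q evaluated on a single counterexample (x, temps)."""
    if not assumptions_hold(order, x, temps):
        return True  # the implication is vacuously satisfied
    return temps[order[-1]][1] == spec(x)


bad = ("mul2", "add10", "add1")
cex_in = 1
cex_temps = temps_for(bad, cex_in)
eliminated = [o for o in permutations(COMPONENTS) if not passes(o, cex_in, cex_temps)]
```

Here eliminated contains only the original incorrect ordering. The ordering that swaps the +1 and +10 components computes exactly the same function, yet it passes the counterexample because the recorded temporaries no longer satisfy the assumptions.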

This is a general problem of synthesis problems that have the form \[ \exists \phi . \forall in, temp. P(\phi, in, temp) \Rightarrow Q(\phi, in, temp) \] where the condition $P(\phi, in, temp)$ is very strong and, for a given counterexample, it is easier for the synthesizer to invalidate $P$ than to find the $\phi$ that satisfies $Q$. The solution proposed by Jha et al. is to change the form of the constraint to \[ \exists \phi . \forall in. \exists temp. P(\phi, in, temp) \wedge Q(\phi, in, temp) \] In general, the two formulas are not equivalent. The first formula requires that $Q$ is valid for all $temp$ that satisfy $P$, whereas the second only requires that there exist some value of $temp$ that satisfies both $P$ and $Q$. On the other hand, the second formula demands the existence of a $temp$ that satisfies $P$, whereas the first can be satisfied if no such $temp$ exists. However, the two formulas are equivalent if $P(\phi, in, temp)$ is satisfied by one and only one value of $temp$ for any $\phi, in$ pair.
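The relationship between the two forms can be checked by brute force on small finite domains. The following Python sketch holds $\phi$ fixed and compares the two inner formulas; the domains and the example predicates are hypothetical choices made only to exhibit the three cases discussed above.

```python
INPUTS = range(4)
TEMPS = range(8)


def forall_form(P, Q):
    """forall in, temp. P(in, temp) => Q(in, temp), with phi held fixed."""
    return all((not P(i, t)) or Q(i, t) for i in INPUTS for t in TEMPS)


def exists_form(P, Q):
    """forall in. exists temp. P(in, temp) and Q(in, temp), with phi held fixed."""
    return all(any(P(i, t) and Q(i, t) for t in TEMPS) for i in INPUTS)


def q_even(i, t):
    return t % 2 == 0   # Q: the temporary is even


def p_functional(i, t):
    return t == 2 * i   # exactly one temp satisfies P for each input


def p_unsat(i, t):
    return False        # no temp satisfies P


def p_loose(i, t):
    return t >= i       # many temps satisfy P for each input
```

With p_functional the two forms agree. With p_unsat the implication form holds vacuously while the existential form fails, and with p_loose the existential form holds while the implication form fails.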

What do we gain from adding an additional quantifier alternation? In the context of CEGIS, this extra quantification has the benefit that the counterexample no longer has to include the temporary values. For the synthesis phase, the counterexample can be limited to $in$. For example, when we have 2 counterexample inputs $in_1, in_2$, the constraint on $\phi$ would now be: \[ \exists \phi . \exists temp_1, temp_2. P(\phi, in_1, temp_1) \wedge Q(\phi, in_1, temp_1) \wedge P(\phi, in_2, temp_2) \wedge Q(\phi, in_2, temp_2) \] Note that there is no additional quantifier alternation because the universal quantifier is expanded out into the individual examples. This means that the synthesizer is now responsible for coming up with the temporary values that match the counterexample and its choice of control values. The extra quantifier alternation would cause problems for the verification phase, but for the verification phase it is easy to just use the original form of the constraint.
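A toy CEGIS loop over the three-component example shows the effect of keeping only the inputs as counterexamples. This Python sketch rests on several assumptions: the candidate space is just the six orderings of the components, the synthesis phase is exhaustive enumeration rather than a constraint solver, and, because $P$ determines the temporaries uniquely here, the existential witness for $temp$ is obtained by simply running the components forward. All names are hypothetical.

```python
from itertools import permutations

COMPONENTS = {
    "mul2": lambda x: x * 2,
    "add10": lambda x: x + 10,
    "add1": lambda x: x + 1,
}


def spec(x):
    return (x + 10) * 2 + 1


def witness_temps(order, x):
    """The unique temporaries satisfying P for (order, x)."""
    temps, prev = [], x
    for name in order:
        prev = COMPONENTS[name](prev)
        temps.append(prev)
    return temps


def synthesize(cex_inputs):
    """Find an ordering for which P and Q hold on every stored input."""
    for order in permutations(COMPONENTS):
        if all(witness_temps(order, x)[-1] == spec(x) for x in cex_inputs):
            return order
    return None


def verify(order, domain=range(32)):
    """Return an input on which the candidate disagrees with the spec, if any."""
    for x in domain:
        if witness_temps(order, x)[-1] != spec(x):
            return x
    return None


def cegis():
    cex_inputs = []
    while True:
        order = synthesize(cex_inputs)
        if order is None:
            return None, cex_inputs
        cex = verify(order)
        if cex is None:
            return order, cex_inputs
        cex_inputs.append(cex)


solution, counterexamples = cegis()
```

In this run a single input counterexample rules out both the initial incorrect ordering and its semantically identical variant with +1 and +10 swapped, since the temporaries are recomputed for each candidate instead of being fixed by the counterexample.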

The same idea is used by Sketch as part of its model functionality [SinghSXKS14]. To explain this idea, it is important to first understand the notion of uninterpreted functions. An uninterpreted function is a function about which we know nothing other than the fact that it is a function, meaning that when given the same input it will produce the same output. These are very useful in program verification when we need to verify the equivalence of two pieces of code that both use the same complex routine, but we do not want to have to reason about the details of the routine. For example, if we know that $f$ is a function, we can prove that $f(3+x + 2)$ is equivalent to $f(x + 5)$ without having to know anything about $f$ other than the fact that it is a function that will produce the same output given the same integer input.
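The functional-consistency property can be illustrated by testing against many arbitrary functions. The Python sketch below is only a randomized check, not a proof, and is not how a solver treats uninterpreted functions; make_uninterpreted and equivalent_for_all are hypothetical helpers that model an unknown function as a lazily filled random lookup table.

```python
import random


def make_uninterpreted(seed):
    """An arbitrary function: outputs are random but fixed for each input."""
    rng = random.Random(seed)
    table = {}

    def f(x):
        if x not in table:
            table[x] = rng.randrange(1000)
        return table[x]

    return f


def equivalent_for_all(lhs, rhs, trials=50, inputs=range(32)):
    """True if lhs and rhs agree for every sampled function and input."""
    for seed in range(trials):
        f = make_uninterpreted(seed)
        if any(lhs(f, x) != rhs(f, x) for x in inputs):
            return False
    return True


same = equivalent_for_all(lambda f, x: f(3 + x + 2), lambda f, x: f(x + 5))
different = equivalent_for_all(lambda f, x: f(x), lambda f, x: f(x + 1))
```

The first pair agrees for every sampled function because both sides apply f to the same argument, while the second pair is distinguished by essentially any arbitrary choice of f.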

In some cases, uninterpreted functions can be too restrictive, because we want to be able to state additional properties of the output of a function. These uninterpreted functions with extra properties are often called partially interpreted functions. Sketch allows you to model the behavior of a complex function through an uninterpreted function with additional properties. For example, we can model the behavior of a square root function with the model shown below. model int msqrt(int i){ int rv = sqrtuf(i); if(i==0){ assert rv == 0; } assert rv*rv <= i; assert (rv+1)*(rv+1) > i; return rv; } The internal call to sqrtuf is to an uninterpreted function. The assertions that come after that call relate the return value to the inputs. In the case of this example, they ensure that the value is the integer square root of the input. The assertions in the model act exactly like the predicate $P$ in the general formula above and could lead to the exact same problems. In order to avoid this, Sketch uses the approach outlined earlier to ensure that CEGIS works properly.
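The assertions in the msqrt model can be checked to pin down the return value uniquely, which is the condition under which the two quantified forms discussed earlier coincide. The Python sketch below transcribes the assertions as a predicate and enumerates the admissible outputs; it assumes mathematical integers with no overflow, and the function names and the candidate range are hypothetical.

```python
import math


def model_assertions_hold(i, rv):
    """The assertions of the msqrt model, as a predicate on (input, output)."""
    if i == 0 and rv != 0:
        return False
    return rv * rv <= i and (rv + 1) * (rv + 1) > i


def admissible_outputs(i, candidates=range(-64, 64)):
    """All candidate return values that the model's assertions allow for i."""
    return [rv for rv in candidates if model_assertions_hold(i, rv)]


unique_everywhere = all(
    admissible_outputs(i) == [math.isqrt(i)] for i in range(32)
)
```

For every input in the default 5-bit range, exactly one return value satisfies the assertions, and it is the integer square root.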

The approach used by Sketch to cope with these models is slightly more complex than what has been described here, but the high-level idea is the same. Students who want to learn more are encouraged to read the original paper by Singh et al. [SinghSXKS14].

Limitations of theory reasoning.

In Sketch, the inputs to a harness are limited to integers, booleans (type bit in Sketch) and arrays of them. Additionally, a sketch can invoke uninterpreted functions that produce any of these types as output. However, as has already been alluded to, the verification phase of Sketch will not actually check all possible values of these inputs, only values within a particular bound. The flag --bnd-inbits n determines how many bits the inputs are allowed to have, and in the case of variable-length arrays, the additional flag --bnd-arr-size n limits the maximum size of these arrays (default is 32). It is also important to note that Sketch only considers non-negative values for any of its integer inputs.

In addition to the bounds on the input sizes, it is worth recalling that by default, Sketch uses a one-hot representation for all internal integer values. This can lead to situations where the internal representation of an integer grows to a very large size despite the fact that when executing the program, all integers could be expected to stay within a given range. This problem will manifest itself as Sketch running out of memory, usually during the synthesis phase. One way to deal with this problem is to use the --bnd-int-range k flag. This tells Sketch that for all inputs in range, we do not expect any value anywhere in the computation to be outside the range $[-k, k]$. Sketch then uses this information to keep the representations from growing beyond $2*k$ in size. Alternatively, Sketch also has a native integer solver that can be enabled with the flag --slv-nativeints. This solver does not rely on a one-hot encoding, and therefore scales better for problems where larger integers are expected.