Lecture 4: Top-Down and Type-Directed Explicit Search.
Bottom-up search strategies start by constructing small program fragments and then combine them into progressively larger fragments until a complete program is constructed. By contrast, top-down search works by constructing programs with holes and progressively filling in the details of these holes. This strategy has proven particularly effective for constructing functional programs.
Top-down explicit search is also the first context where we encounter another powerful idea in synthesis: the use of types to prune the search space. Up to this point, we have relied on a grammar to define the space of legal programs and have assumed that any program that is legal with respect to the grammar is valid. Types, however, provide an additional mechanism for ruling out invalid programs. Because type systems are generally designed to support local checking, types allow us to rule out an invalid program fragment quickly, so we never waste time exploring the many programs that contain it.
The idea of leveraging the type system to aggressively prune the search space was proposed almost simultaneously by the team of Osera and Zdancewic [Osera:2015] and by Feser, Chaudhuri and Dillig [Feser:2015] at PLDI 2015. In this lecture we focus on the approach of Feser et al., who also extended type-directed pruning with additional pruning strategies based on deductive rules. Before we can explore the results of these and other papers, we introduce the simple functional language we will use for the rest of the section.
A simple language for list manipulation
In order to explain this algorithm, we are going to be using a simple functional language for manipulating lists. The language will be given by the following grammar:
$
\begin{array}{rcl}
expr &=& var \\
&|& \lambda x. expr\\
&|& \mbox{filter } expr ~ expr \\
&|& \mbox{map } expr ~ expr \\
&|& \mbox{foldl } expr ~ expr ~ expr \\
&|& boolExpr ~|~ arithExpr \\
\end{array}
$
The symbol $var$ can represent any variable currently
in scope. We assume the variable $x$ in the lambda
construct is just a fresh variable that does not appear
anywhere else in the expression. We assume that
the boolean and arithmetic expressions are standard
boolean and arithmetic expressions as defined earlier
in other languages.
The expressions filter, map and foldl are defined below
using functional programming notation.
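As a reference point, the standard meaning of the three combinators can be sketched in Python. This is a minimal illustration rather than the lecture's own notation: the names `map_` and `filter_`, the function-first argument order, and the type annotations are assumptions of this sketch.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_(f: Callable[[A], B], lst: List[A]) -> List[B]:
    """Apply f to every element; the result has the same length as lst."""
    return [f(x) for x in lst]


def filter_(p: Callable[[A], bool], lst: List[A]) -> List[A]:
    """Keep, in their original order, the elements for which p holds."""
    return [x for x in lst if p(x)]


def foldl(f: Callable[[B, A], B], start: B, lst: List[A]) -> B:
    """Left fold: thread an accumulator through the list, left to right."""
    acc = start
    for x in lst:
        acc = f(acc, x)
    return acc
```

Note that `map_` always preserves the length of its input, while `filter_` can only shorten it; both facts are used later when pruning the search.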
Running example
To illustrate the main ideas of the algorithm, consider the following example from Feser et al. The goal is to define a function dropmins that takes as input a list of lists of integers, where each inner list corresponds to a list of grades. The output of dropmins must be a new list of lists in which the lowest grade has been dropped from each inner list.
For example, below is an input/output pair for this function:
Basic top down search
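The search starts from a single hole, $expr$, and expands it using each production of the grammar. After this first expansion, the only candidate without unexpanded holes is the program that simply returns the input variable in.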
Evaluating this program, it is clear that it will not produce
the desired output, so it can be clearly discarded.
After this first expansion, all the other programs involve un-expanded
expressions, so they cannot be evaluated. Even without evaluating them,
however, we can already determine that many of them cannot possibly
be expanded into a program that works. To understand why, we need
to understand more about the type system for this simple language.
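The first expansion step described above can be sketched as follows. This is a minimal illustration, not the implementation of Feser et al.: the tuple encoding of partial programs and the names `expand_expr` and `is_complete` are assumptions made for this sketch.

```python
# Nonterminals of the grammar; a bare nonterminal inside a program is a hole.
NONTERMINALS = {"expr", "boolExpr", "arithExpr"}


def expand_expr(variables):
    """Return every one-step expansion of a single `expr` hole."""
    candidates = [("var", v) for v in variables]
    candidates += [
        ("lambda", "x", "expr"),
        ("filter", "expr", "expr"),
        ("map", "expr", "expr"),
        ("foldl", "expr", "expr", "expr"),
        "boolExpr",   # still a hole: must be expanded further
        "arithExpr",  # still a hole: must be expanded further
    ]
    return candidates


def is_complete(program):
    """A program is complete when it contains no unexpanded nonterminal."""
    if isinstance(program, str):
        return program not in NONTERMINALS
    # program[0] is the operator name; the remaining entries are its parts.
    return all(is_complete(part) for part in program[1:])


candidates = expand_expr(["in"])
complete = [c for c in candidates if is_complete(c)]
```

With `in` as the only variable in scope, `complete` contains just `("var", "in")`; every other candidate still has holes and cannot be evaluated yet.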
A simple type system
Not all programs in the language above are valid programs. This is because expressions in this language actually have types. We have seen languages with types before. For example, the list language from Lecture 2 distinguished between two types: integer and List. Because we only had two types, it made sense to simply distinguish between these in the grammar by separating integer expressions from list expressions. This language is different, though, because we have a potentially infinite set of types! The reason it is infinite is that we want to support not just integers and lists of integers, but also arbitrarily nested lists of integers, as well as functions.
$
\begin{array}{rcl}
\tau &=& Int ~ | ~ Bool\\
&|& [\tau] \\
&|& \tau \rightarrow \tau \\
\end{array}
$
For foldl, write $\tau_{start}$ for the type of the initial value and $\tau_{lst}$ for the type of the list elements, so that the function argument has type $\tau_{start} \rightarrow \tau_{lst} \rightarrow \tau_{start}$. When foldl was used in the example earlier, $\tau_{start}$ was equal to $Bool$ (because the second parameter was the boolean expression False), and $\tau_{lst}$ was equal to $Int$ because the third parameter was a list of integers. The function (λ t. λ w. t or (w < z)) that was passed as the first parameter therefore had type $\tau_{start} \rightarrow \tau_{lst} \rightarrow \tau_{start} = Bool \rightarrow Int \rightarrow Bool$.
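To make this instantiation concrete, here is a small sketch that uses Python's `functools.reduce` as a stand-in for foldl. The wrapper name `has_smaller` and the sample lists in the usage note are hypothetical; only the step function and the initial value False come from the text.

```python
from functools import reduce
from typing import List


def has_smaller(z: int, lst: List[int]) -> bool:
    """Fold (λ t. λ w. t or (w < z)) over lst, starting from False."""
    # step : Bool -> Int -> Bool, i.e. tau_start = Bool and tau_lst = Int
    def step(t: bool, w: int) -> bool:
        return t or (w < z)

    # reduce(step, lst, False) plays the role of foldl step False lst
    return reduce(step, lst, False)
```

For instance, `has_smaller(3, [5, 3, 2])` is true because 2 < 3, while `has_smaller(2, [5, 3, 2])` is false: the accumulator records whether any element seen so far is strictly smaller than `z`.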
Pruning the search with types
Consider what the types tell us about the first level of the search. Any expression of the form λ x. expr will be of type $\tau_1 \rightarrow \tau_2$. Different expressions $expr$ may lead to different types $\tau_1$ and $\tau_2$, but a lambda must always have a function type. Since the output of dropmins must have type $[[Int]]$, no matter what expression we use for $expr$, $\lambda x. expr$ will never have the type we need for the output, and can therefore be safely discarded.
The same is true of the integer and boolean expressions. Regardless of how we instantiate them, they will always have type $Int$ and $Bool$ respectively, so they can never have the desired $[[Int]]$ type and can be safely discarded as well.
An expression such as map expr expr can have the desired type, but note from the typing rule that, in order to have the desired type, the first expression must be a function of type $\tau_1 \rightarrow [Int]$ for some $\tau_1$. This imposes strong constraints on the next level of the search, because we can again rule out any expression that does not have the desired type even before having concrete values for any of its subexpressions.
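The pruning argument above can be sketched as a function that, given a production and a goal type, either rejects the production or returns the goal types of its holes. This is a simplified illustration under stated assumptions: the tuple encoding of types, the names `child_goals`, `list_of` and `fun`, and the `"?"` placeholder are all hypothetical, the filter rule is the standard one, and a real implementation would use unification so that related unknowns stay linked.

```python
INT, BOOL = "Int", "Bool"
UNKNOWN = "?"  # placeholder for a type that is not yet determined


def list_of(t):
    return ("List", t)


def fun(a, b):
    return ("Fun", a, b)


# Variables are omitted: their types come from the environment.
PRODUCTIONS = ["lambda", "filter", "map", "foldl", "boolExpr", "arithExpr"]


def child_goals(production, goal):
    """Goal types for the production's holes, or None if it cannot have type goal."""
    is_list = isinstance(goal, tuple) and goal[0] == "List"
    is_fun = isinstance(goal, tuple) and goal[0] == "Fun"
    if production == "lambda":
        # A lambda always has a function type; its body must have the result type.
        return [goal[2]] if is_fun else None
    if production == "boolExpr":
        return [] if goal == BOOL else None
    if production == "arithExpr":
        return [] if goal == INT else None
    if production == "map":
        # map f lst : [b] needs f : a -> b and lst : [a]
        return [fun(UNKNOWN, goal[1]), list_of(UNKNOWN)] if is_list else None
    if production == "filter":
        # filter p lst : [a] needs p : a -> Bool and lst : [a]
        return [fun(goal[1], BOOL), goal] if is_list else None
    if production == "foldl":
        # foldl f start lst : s needs f : s -> a -> s, start : s, lst : [a]
        return [fun(goal, fun(UNKNOWN, goal)), goal, list_of(UNKNOWN)]
    raise ValueError(f"unknown production: {production}")


goal = list_of(list_of(INT))  # the type [[Int]]
surviving = [p for p in PRODUCTIONS if child_goals(p, goal) is not None]
```

For the goal type $[[Int]]$, only filter, map and foldl survive, and the first hole of map receives the goal `("Fun", "?", ("List", "Int"))`, which corresponds to $\tau_1 \rightarrow [Int]$.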
Further pruning with deductive rules
Consider the candidate map (λx. expr) in. From the definition of map, we can derive input/output examples for the individual expression expr, because we know that every element of the input list will be processed by this expression and its output will be placed at the corresponding position of the output list. Therefore, when searching for the expression, we can check it directly against its own set of local input/output pairs, as illustrated in the figure.
We could further hypothesize that λx. expr is actually λx. map (λy. expr) x, but even without knowing what expr is, we can see that this is not going to work: the output of map must have the same length as its input, whereas each inner output list is shorter than the corresponding input list because a grade has been dropped.
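The deduction for map described above can be sketched as a function that turns examples for the whole map expression into examples for its function argument, and reports failure when no such function can exist. The name `deduce_map` and the sample grades below are hypothetical; they are not taken from the figure or from Feser et al.

```python
def deduce_map(examples):
    """Given (input_list, output_list) examples for `map f lst`, return
    (x, y) examples for f, or None if no function f can satisfy them."""
    derived = []
    for xs, ys in examples:
        if len(xs) != len(ys):
            return None  # map always preserves the length of its input
        for x, y in zip(xs, ys):
            for seen_x, seen_y in derived:
                if seen_x == x and seen_y != y:
                    return None  # f would need two outputs for the same input
            if (x, y) not in derived:
                derived.append((x, y))
    return derived


# Hypothetical example for dropmins: one list of grade lists and its output.
top_level = [([[1, 3, 5], [5, 3, 2]], [[3, 5], [5, 3]])]

# Examples for the body of λx. expr, one per inner list.
inner = deduce_map(top_level)

# Hypothesizing that the body is itself a map fails on the length check.
nested = deduce_map(inner)
```

Here `inner` pairs each inner list with its expected result, and `nested` is `None`: every derived example has an output shorter than its input, so the nested map hypothesis is rejected without enumerating any candidate for its body.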
On the other hand, if we hypothesize that λx. expr is actually λx. filter (λy. expr) x, we can again propagate the input/output examples to the individual expression expr, based on our knowledge of how filter works.
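A corresponding deduction for filter can be sketched in the same style. Since the predicate body may mention both x and y, each derived example records the pair (x, y) together with whether y must be kept. The name `deduce_filter` and the sample data are assumptions made for illustration, and the sketch does not check for contradictions between different top-level examples.

```python
def deduce_filter(examples):
    """Given (xs, ys) examples for the body `filter (λy. expr) x` with x = xs,
    return ((xs, y), keep) examples for expr, or None if some example
    cannot be produced by any filter."""
    derived = []
    for xs, ys in examples:
        # A predicate depends only on the value of y, so filter keeps exactly
        # those elements of xs whose value occurs in ys, in the original order.
        if [y for y in xs if y in ys] != ys:
            return None
        for y in xs:
            entry = ((xs, y), y in ys)
            if entry not in derived:
                derived.append(entry)
    return derived


# Hypothetical example: drop the lowest grade from [1, 3, 5].
predicate_examples = deduce_filter([([1, 3, 5], [3, 5])])

# Reordering is impossible for filter, so this hypothesis is rejected.
impossible = deduce_filter([([1, 3, 5], [5, 3])])
```

In the first call the predicate must return False for 1 and True for 3 and 5, which gives the search a precise local specification for expr; the second call returns `None` because filter cannot reorder elements.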