Introduction to Program Synthesis

© Armando Solar-Lezama. 2018. All rights reserved.

Assignment 2

Submission Instructions This assignment is due at 5pm on the date indicated in the class calendar. In order to submit your code, pack each requested file into a tar file and submit through Canvas (we will create a Canvas assignment as the deadline approaches).

Collaboration policy You are allowed to consult with at most one other student while working on the pset, but every student is responsible for writing their own code and submitting their own work. If you do consult with another student, you should explicitly acknowledge them in your submission (you can do that in a comment in question1a.sk).

Problem 1 (30 pts)

For this problem, you will be encoding an arithmetic language in Sketch to synthesize expressions matching input output examples. You should download the latest version of sketch from here. Note: unless you run into issues, do not follow the build instructions. Just unzip the file, and then run the binary located at sketch-frontend/sketch. On some operating systems (e.g. one TA's M1 Mac) you may need to follow the build instructions. To test that your installation is working, you may find it useful to ensure that the first example sketch (doubleSketch) from the manual works.

Consider the following simple arithmetic language:

Int := | n.................................// numeric constant | x.................................// variable | Plus(int, int):int................// addition | Times(int, int):int...............// multiplication | ITE(bool, int, int):int...........// if-then-else Bool := | false.............................// constant false | true..............................// constant true | Lt(int, int):bool.................// less than | Not(bool):bool....................// boolean negation | And(bool, bool):bool..............// boolean conjunction

Problem 1.a

Encode the arithmetic grammar from above as a generator in Sketch, using it to synthesize functions solving the following examples: x=5, y=5, out=15; x=8, y=3, out=14; x=1234, y=227, out=1688; And x=10, y=7, out=17 x=4, y=7, out=-7 x=10, y=3, out=13 x=1, y=-7, out=-6 x=1, y=8, out=-8 Starter code for this problem can be found here. You will find the sketch language reference useful. Submit your code as question1a.sk.

Problem 1.b

In a copy of the file, add the following constraints: Submit your code as question1b.sk.

Problem 2 (40 pts)

Top-down searches (Lecture 4) are widely used technique in program synthesis as they are simple and general, not requiring the context-independent equivalence property relied on by bottom-up approaches. Top-down enumeration exhaustively explores a space of programs in a grammar by repeatedly expanding holes in the current partially-constructed program using production rules from the grammar. A key part of efficient top-down enumeration is the ability to perform these expansions by mutating the program in-place, rather than making a copy of the entire program at each step. When the search hits a dead end (e.g. a size limit is reached, or an incorrect program is found) we require the ability to backtrack by undoing the expansions ("unexpanding") in-place. This problem will guide you through implementing expansion and unexpansion, then using them to implement the overall top-down search algorithm.

Skeleton code for this problem can be found here.

First, move to the q2-3/ directory and run the following command to install the necessary packages:

julia --project -e 'using Pkg; Pkg.instantiate()' Then run the following command to make sure everything is set up properly: julia --project test/internal.jl Notes:

Problem 2.a

First, write implementations of the following functions in src/program_search/expansion.jl. Refer to the definitions and docstrings of the expression type struct Exp and the production rule type struct Production to understand their structure.

You can test your implementation with the following command: julia --project test/q2a.jl

Problem 2.b

Implement expand and unexpand in src/program_search/top_down.jl. These functions should call expand_inner and unexpand_inner, and modifying the search state (struct TopDownSearch) fields .e, .weight, and .holes appropriately. See the definitions of struct TopDownSearch and struct SubExp for the expected structure. You can test your implementation with the following command: julia --project test/q2b.jl

Problem 2.c

Implement top_down_search in src/program_search/top_down.jl. This function should perform a depth-first top-down enumerative search, and will call expand and unexpand. The search should be bounded in a few ways, configured by search.config (see struct TopDownConfig). The last condition might seem odd, but it's relevant because your top-down search code will be executed by the testing code within a larger iterative deepening search, which will first attempt to find all programs in the weight range [0, 1.0], then [1.0, 2.0], etc. The last condition avoids wasting effort re-evaluating solutions that were encountered in earlier iterations of the iterative deepening search.

The pseudocode for the top_down_search function you will implement is given below:

You can test your implementation with the following command:

julia --project test/q2c.jl

Problem 2.d

In this part we'll explore the value of types in constraining the size of the search space.

Modify top_down_search so that if config.disable_types is true, it doesn't check that the production type matches the hole type.

To measure the size of the search space more easily, we've created a test that doesn't actually execute programs, and just reports the number of expansions and number of complete programs encountered at each layer of the iterative deepening search. Run it with the following command: julia --project test/q2d.jl This will run the search with and without type-based pruning - you should get a dramatic difference in the number of expansions and completions.

Problem 3 (30 pts)

Note on dependencies with Problem 2: Part 3(b-d) rely on having finished 2(b); part 3(c) is similar to 2(c) so doing 2(c) first is recommended; and the ungraded final DreamCoder test at the end of Problem 3 relies on having finished 2(c).

This problem builds on the previous problem, exploring component discovery (Lecture 9) in the same graphics domain, which is inspired by a domain from the DreamCoder paper. In Part (a), you'll be using an off-the-shelf component discovery tool (Stitch) to discover components in the graphics domain. Stitch approaches component discovery through a top-down search over the space of components, made efficient through a branch-and-bound strategy. In Part (b-d), you'll be implementing a component discovery algorithm that incorporates some of the core ideas of Stitch.

Problem 3.a

Install the Python bindings for Stitch from here.

Modify q3a.py to call Stitch on the ground truth programs in data/graphics.json. Run Stitch for 3 iterations with a max arity of 3. Print out the 3 components (abstractions) along with the rewritten programs. Include the output of your code in the placeholder comment at the bottom of the file.

What do you think each of the components do? Add a comment very briefly explaining each component. It can be helpful to look at which rewritten programs the components are used in.

Problem 3.b

As discussed in Lecture 9, at each step of search Stitch expands a partial component with a production from the grammar. For example, we might expand the rightmost hole of (+ ?int ?int) with the production (* ?int ?int) to get (+ ?int (* ?int ?int)).

When expanding a hole with a production, the new partial component will match at a subset of the locations it matched at before. This subset property is critical to the efficiency of the search: to get our new set of matches after an expansion, we can simply filter the original set of matches to only keep the ones where the chosen production ((* ?int ?int)) appears at the expansion location.

Implement subset_matches_to_production in component_search/branch_and_bound.jl. This function performs the subsetting operation described above, setting search.matches to a (new) filtered list of matches - but doesn't mutate the input list of matches nor the original search.matches. A few points that may be helpful: You can test your implementation with the following command: julia --project test/q3b.jl

Problem 3.c

Implement branch_and_bound in component_search/branch_and_bound.jl. This is the main loop of the component search, and will be quite similar to the implementation of top_down_search from Problem 2(c). There are a few points at which it's noted that you must call a specific function - be sure to do so, as it's key to automatically incorporating some extensions later. The pseudocode is given below: You can test your implementation with the following command. It will only find components that don't use variables, so it will just find various large, common subtrees. julia --project test/q3c-novars.jl However, through component_search/vars.jl, we've added support for variables which should allow for some more interesting components. You shouldn't have to make any changes unless you run into issues. Run it with the following command: julia --project test/q3c-vars.jl

Problem 3.d

Implement the following functions in component_search/branch_and_bound.jl: You can see the effect of pruning even on the small set of graphics programs with the following command: julia --project test/q3d-base.jl

Problem 3.e (optional, ungraded)

There's no implementation in this section, just a few extra commands to try. The pruning from the previous part also allows us to scale up to the nuts-bolts domain used in the Stitch paper (originally from Wong et al. 2022). You can run this domain with the following command (note that this doesn't run the without-pruning version as that fails to terminate): julia --project test/q3d-scaling.jl Finally, equipped with a component discovery algorithm, we can recreate a simple version of the DreamCoder synthesis loop, alternating between component discovery to add new components to the grammar and synthesis to find new solutions to run component discovery on. You can run it with the following command: julia --project test/q3d-dreamcoder.jl If you have multiple cores, you can add a julia -t [num_threads] flag to the command to use multiple threads. You should see that the first 5 tasks are solved in the first iteration. A polygon drawing component is learned which is then used to solve tasks 7-8. From these, a chain of polygons component is learned. The discovery of the polygon component brings the search closer to solving the last few tasks which have to do with rotational symmetry, but the search as-is is not quite able to solve them.

You can use the following command to run all tests from Problems 2 and 3 (except for the dreamcoder test since it takes longer):

julia --project src/run_all_tests.jl