MASSACHVSETTS INSTITVTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.891 Spring 2008 Problem Set 6 Issued: Wed. 12 Mar. 2008 Due: Wed. 19 Mar. 2008 Reading: Lots of code! Code: search.scm, eq-prop.scm, eq-sets.scm, queue.scm, graph.scm, maze.scm, load.scm (attached). Organizing Search Strategies Most complex problems have a component of search, because problem solvers must make choices among alternatives that are not totally determined until the problem under consideration has had more details elaborated. For example, in solving a numerical or an algebraic problem one may have to choose a sign for a square root before enough is known to rule out one of the choices. Or in the analysis of an electrical circuit one may have to assume states (on or off) for diodes in the circuit in order to complete an analysis that may ultimately prove (by finding a contradiction) that the states assumed cannot be the correct ones in the situation specified. Unfortunately, if a search strategy is not incorporated into a program neatly it can become "the tail that wags the dog," distorting the rest of the program so that one cannot easily modify the search strategy without rewriting the whole program. In these exercises we begin to investigate the extent to which a search strategy can be separated from the other parts of a program, so that one can interchange search strategies without greatly modifying the program. We will also see how a program can be enhanced by clever integration of a search strategy. Example: Running a maze Imagine the problem faced by a rat who runs a maze looking for rewards that are stashed at particular places in the maze. We can abstract this problem as a graph with nodes representing the places in the maze and edges connecting the places. We can arrange, with trapdoors, that some paths can be traversed in only one direction. Thus we build a directed graph. So if a path can be run in both directions we must put in two edges to describe that possibility. Now the rat has to think about the order in which it will explore the maze. Once it finds the goodies at one node, it needs to be able to continue searching, because there may be goodies in other places. It also must have a way of avoiding running in circles, never finding anything, and it must have a way to know when it has exhausted the maze and give up. We will have to think about all of these things, and we will discover that we need to make a quantum rat, not a classical one. Consider the following maze: START ^ | | ----- ----- | | V --------A<---------B<------>D* | ^ | ^ | | | | | | ------- | | --- | | V \ V | *E----------->C--------------- ^ | | | | V -------------F This maze has seven nodes and twelve edges. (The path between nodes B and D is bidirectional, so we need two edges to represent it.) The rat is dropped into the maze at START and the reward is present at nodes D and E. (This is indicated by an asterisk.) Representing the maze We represent a maze as a directed graph by its nodes and edges. Exactly how that is done should not be relevant to the search engine. We insulate the search engine from the representation of the graph by an abstraction. Given a node of the graph, the search engine needs only know how to find the edges leading out of the node (and perhaps into the node) and given an edge the search engine needs only know the target node (and perhaps the source node) of the edge. The only other constraint on the representation is that the search engine must be able to tell if it has encountered a node previously. This requires that the node be unambiguously identifiable. So we specify that the means for representing the graph must support the following procedures: (edge->target ) ==> a node (edge->source ) ==> a node (node->out-edges ) ==> a list of edges Also, in a Lisp environment, it is convenient for nodes that are the same to satisfy the eq? relation. This provides an effective way to determine identity. Our machinery for constructing graphs is presented in the file graph.scm, but this file can be replaced with any graph-constructor that may be needed, so long as it satisfies the API described above. Graphs are constructed using three procedures: (make-node ) ==> a node (make-edge ) ==> an edge (make-graph ... ) ==> a graph So, for example, to construct a new instance of the maze described above we can make a procedure: (define (setup-a-maze) (let ((start (make-node 'start)) (a (make-node 'aa)) (b (make-node 'bb)) (c (make-node 'cc)) (d (make-node 'dd)) (e (make-node 'ee)) (f (make-node 'ff))) (make-edge start b) (make-edge b a) (make-edge a start) (make-edge b d) (make-edge d b) (make-edge a e) (make-edge c a) (make-edge b c) (make-edge c d) (make-edge e c) (make-edge c f) (make-edge f e) (eq-put! d 'food '*) (eq-put! e 'food '*) (make-graph start a b c d e f))) This procedure builds a graph with nodes named "aa", "bb", etc. (These names are chosen so as to avoid confusing the reader with the local variable names "a", "b", etc., which have those nodes as their values in this procedure.) We also use the "property-list" mechanism (see the attached file eq-prop.scm for the definition) to mark the food-bearing nodes. This particular maze is specified in the file maze.scm, along with some example rats that can search it. Searching the maze The general method of searching a maze from a particular node is summarized as follows: If the current node is one that we want, return it, with a way of proceeding to find more, otherwise proceed: Choose one of the links that we can proceed from and use it to get its target node. Make that the current node, if possible, and iterate. If we run out of such links, fail. This description does not specify a particular strategy. A depth-first strategy specifies that the first links we should try are descendents of the current node (the ones emanating from the current node), and if we run out of those we should go back to the node we came from and try the unexplored links emanating from that parent node. We backtrack to parents until we run out of unexplored links from the initial node. The fact that we can backtrack to a parent node (apparently traversing an edge in the wrong direction) is the reason why our rat is a quantum rat rather than a classical rat! A breadth-first strategy, by contrast, specifies that we must explore sibling links of the current node before descendent links of the current node. (A sibling link is a link emanating from the parent of the current node.) Of course, there are many other strategies, which can depend on specific properties of the nodes and links, that can be implemented. The file search.scm contains a general graph-searching procedure, graph:search, which takes three required arguments and two optional arguments. The required arguments are an initial node to start searching from, a predicate that will be true only if the node given to it is good enough to satisfy the search, and a continuation procedure that takes a satisfactory node and a procedure to run if the search is to be continued. The optional arguments are (1) how edges are to be added to the agenda, allowing various search strategies, and (2) what action is to be taken, if any, upon traversing an edge. The default new-edge strategy is to add the new edges to the end of a FIFO, yielding a breadth-first search. We provide an alternate strategy (see depth-first-add-edges! in the same file) that adds the edges in a LIFO fashion, producing a depth-first search strategy. Of course, other strategies may be implemented and passed in for that optional argument. The default edge-traversal procedure, graph:min-edges, maintains a record of the minimum distance, in number of edges traversed, from the start node to any node encountered in the search. This is also done with property lists. Note that if the global variable graph:search-wallp is not false, a line is written to the output for every edge traversed. (By long tradition such debugging transcripts are called "wallpaper", hence the name "wallp".) Look at the examples at the end of file maze.scm. ------------- Problem 6.1: The rats manage to get all of the food (at D and E). How is that done? Explain this in one short sentence. The mole rat (depth first) reports that it found food at node dd, which was 5 edges away from the start. But after it finished looking for all of the food we found that the distance was really only two edges: (eq-get (graph->node maze-2 'dd) (graph->node maze-2 'start)) ;Value: 2 Explain this behavior in a few short sentences. ------------- The graph-construction mechanism we provided in graph.scm is good for making explicit finite graphs, such as encountered by our rat. But for many problems the graph is not finite, or it is not apparent how to construct except by discovery in the search. For example, the problem of finding Pythagorean triples (triples of integers a, b, c such that a^2+b^2=c^2) can be thought of as a graph search, where the nodes are particular triples of numbers and the edges are increments to a or b or c. Now the search engine can never look at more than a finite piece of the graph. All that matters is that the piece that is looked at exists when needed! ------------- Problem 6.2: Write a different version of graph.scm that supports the API needed by the search engine, but which provides for the lazy incremental construction of an implicit graph. Demonstrate your code by using it to find Ramanujan numbers (numbers that are the sum of two cubes in two different ways). If you don't like Ramanujan numbers, pick a classic puzzle, such as the "Missionaries and Cannibals" (see www.jimloy.com/puzz/cannibal.htm), or any other one that you find fun. Represent the puzzle as an implicit graph and demonstrate how to solve it. ------------- ------------- Problem 6.3: Often in a game like chess we want a "plausible-move generator" that sorts the legal moves so as to arrange that locally better moves are considered first. We also want a "static board evaluator" that evaluates positions for their value. Can the searching paradigm provided here be used or extended to incorporate such ideas without breaking abstraction barriers? For example, can you add information to the nodes and edges and provide a new-edge strategy to make a best-first-search that gets better performance on your problem? Explain your opinions on this matter in a coherent paragraph or two. You may illustrate your explanation with code, if that is useful. ------------- ------------- Problem 6.4: (optional!) If you are feeling in a "matcho" programming mood, pick a complex game, where you cannot enumerate the states explicitly. (Examples of such games are Chess, Go, Backgammon, various card games, etc.) Express its legal moves in terms of a node and edge generator. (You probably want to make a very simplified version of one of these games... Chess is too big a project for a weekend hackathon, unless your name is Richard Greenblatt!). Demonstrate your code. WARNING: This is not a small job: don't do it if it prevents you from studying for some other class! We have known people to spend months on problems like this. On the other hand, something like this could be an interesting term project. ------------- ;;;; File: search.scm ;;;; Searching a Directed Graph ;;; A directed graph has nodes and edges. From a given edge we can ;;; get the source node and the target node. From a given node we can ;;; get the incoming edges and the outgoing edges. ;;; The search procedure is given a graph, an initial node, a ;;; predicate for accepting a node, and a continuation that takes ;;; an accepted node and a procedure to call to reject the given ;;; node. ;;; In this system nodes must have identity. That is, equivalent ;;; nodes must be eq? This is necessary to allow us to mark them and ;;; otherwise annotate them with path-dependent information that may ;;; be discovered in search. (define (graph:search initial-node accept-node? continue #!optional new-edges! edge-action) (if (default-object? edge-action) (set! edge-action (graph:min-edges initial-node))) (if (default-object? new-edges!) (set! new-edges! breadth-first-add-edges!)) ;; continue = (lambda (node get-another) ...) (let ((pending-edges (empty-pending-edges)) (new-node?! (new-node-recorder))) (define (consider-node node) (define (reject-this-node) (and (there-are-edges? pending-edges) (let* ((chosen-edge (get-edge! pending-edges)) (chosen-node (edge->target chosen-edge))) (and graph:search-wallp (write-line (list 'edge (node->name (edge->source chosen-edge)) (node->name (edge->target chosen-edge)) (length (queue:front pending-edges))))) (edge-action chosen-edge) (consider-node chosen-node)))) (if (new-node?! node) (begin (new-edges! pending-edges (node->out-edges node)) (if (accept-node? node) (continue node reject-this-node) (reject-this-node))) (reject-this-node))) (consider-node initial-node))) (define graph:search-wallp #f) ;;; A handy state accumulator (define (graph:min-edges initial-node) (eq-put! initial-node initial-node 0) (lambda (edge) (let ((source (edge->source edge)) (target (edge->target edge))) (let ((sd (eq-get source initial-node)) (td (eq-get target initial-node))) (if td (eq-put! target initial-node (min td (+ sd 1))) (eq-put! target initial-node (+ sd 1))))))) ;;; Search data structures (define (new-node-recorder) (let ((table (make-eq-hash-table))) (define (new-node?! node) (if (hash-table/get table node #f) #f (begin (hash-table/put! table node #t) #t))) new-node?!)) ;;; Edges are arranged in a traditional Lisp queue... (define (empty-pending-edges) (queue:make)) (define (there-are-edges? edges) (not (queue:empty? edges))) (define (get-edge! edges) (let ((v (queue:first edges))) (queue:delete-first! edges) v)) ;;; Search control is determined by where new edges are inserted: (define (breadth-first-add-edges! edges new-edges) (for-each (lambda (new-edge) (queue:insert-rear! edges new-edge)) new-edges)) (define (depth-first-add-edges! edges new-edges) (and new-edges (for-each (lambda (new-edge) (queue:insert-front! edges new-edge)) new-edges))) ;;;; File: eq-prop.scm ;;;; Traditional LISP property lists ;;; extended to work on any kind of eq? ;;; data structure. (declare (usual-integrations)) (define eq-properties (make-eq-hash-table)) (define (eq-put! node property value) (let ((plist (hash-table/get eq-properties node #f))) (if plist (let ((vcell (assq property (cdr plist)))) (if vcell (set-cdr! vcell value) (set-cdr! plist (cons (cons property value) (cdr plist))))) (hash-table/put! eq-properties node (list node (cons property value))))) 'done) (define (eq-get node property) (let ((plist (hash-table/get eq-properties node #f))) (if plist (let ((vcell (assq property (cdr plist)))) (if vcell (cdr vcell) #f)) #f))) (define (eq-rem! node property) (let ((plist (hash-table/get eq-properties node #f))) (if plist (let ((vcell (assq property (cdr plist)))) (if vcell (hash-table/put! eq-properties node (delq! vcell plist)))))) 'done) (define (eq-adjoin! node property new) (eq-put! node property (eq-set/adjoin new (or (eq-get node property) '()))) 'done) (define (eq-delete! node property item) (eq-put! node property (eq-set/remove item (or (eq-get node property) '()))) 'done) (define (eq-plist node) (hash-table/get eq-properties node #f)) (define (eq-path path) (define (lp node) (if node (if (pair? path) (eq-get ((eq-path (cdr path)) node) (car path)) node) #f)) lp) ;;;; File: eq-sets.scm ;;;; eq-set utilities from Jinx [a.k.a. GJR] (define-integrable (eq-set/make-empty) '()) (define-integrable (eq-set/empty? set) (null? set)) (define-integrable (eq-set/member? element set) (memq element set)) (define-integrable (eq-set/adjoin element set) (if (eq-set/member? element set) set (cons element set))) (define (eq-set/remove element set) (if (not (eq-set/member? element set)) set (delq element set))) ;; Important: This will return set2 itself (rather than a copy) if the ;; union is set2. Thus eq? can be used on the return value to ;; determine whether the set has grown. (define (eq-set/union set1 set2) (define (loop set new-elements) (if (null? new-elements) set (loop (eq-set/adjoin (car new-elements) set) (cdr new-elements)))) ;; If set2 is smaller than set1, the union is guaranteed not to be set2. (if (< (length set2) (length set1)) (loop set1 set2) (loop set2 set1))) (define (eq-set/intersection set1 set2) (define (examine set1 set2) (let process ((set #| (reverse set1) |# set1) (result (eq-set/make-empty))) (if (null? set) result (process (cdr set) (if (eq-set/member? (car set) set2) (cons (car set) result) result))))) (if (< (length set2) (length set1)) (examine set2 set1) (examine set1 set2))) (define (eq-set/difference set1 set2) (if (null? set2) set1 (let process ((set set1) (result (eq-set/make-empty))) (cond ((null? set) result) ((eq-set/member? (car set) set2) (process (cdr set) result)) (else (process (cdr set) (cons (car set) result))))))) (define (eq-set/subset? set1 set2) (or (eq-set/empty? set1) (and (eq-set/member? (car set1) set2) (eq-set/subset? (cdr set1) set2)))) (define (eq-set/equal? set1 set2) (or (eq? set1 set2) (and (eq-set/subset? set1 set2) (eq-set/subset? set2 set1)))) ;;; File: queue.scm ;;; Traditional Lisp queue data structure. (define-record-type (%make-queue front rear) queue? (front queue:front queue:set-front!) (rear queue:rear queue:set-rear!)) (define (queue:make) (%make-queue '() '())) (define (queue:empty? q) (null? (queue:front q))) (define (queue:first q) (if (queue:empty? q) (error "Queue empty -- first") (car (queue:front q)))) (define (queue:delete-first! q) (if (queue:empty? q) (error "Queue empty -- delete-first!") (queue:set-front! q (cdr (queue:front q)))) 'done) (define (queue:insert-front! q item) (cond ((queue:empty? q) (let ((new-pair (cons item '()))) (queue:set-front! q new-pair) (queue:set-rear! q new-pair))) (else (queue:set-front! q (cons item (queue:front q))))) 'done) (define (queue:insert-rear! q item) (let ((new-pair (cons item '()))) (cond ((queue:empty? q) (queue:set-front! q new-pair) (queue:set-rear! q new-pair)) (else (set-cdr! (queue:rear q) new-pair) (queue:set-rear! q new-pair)))) 'done) ;;;; File: graph.scm ;;;; Representation of graphs is chosen here (define-record-type (make-node name) node? (name node->name set-node-name!) (graph node->graph set-node-graph!) (out-edges %node->out-edges set-node-out-edges!) (in-edges %node->in-edges set-node-in-edges!)) (define (node->out-edges node) (or (%node->out-edges node) '())) (define (node->in-edges node) (or (%node->in-edges node) '())) (define-record-type (%make-edge source target) edge? (source edge->source) (target edge->target)) (define (make-edge source target) (if (and (node? source) (node? target)) (let ((edge (%make-edge source target))) (set-node-out-edges! source (cons edge (node->out-edges source))) (set-node-in-edges! source (cons edge (node->in-edges source))) edge) (error "Edges connect nodes -- make-edge" source target))) (define-record-type (%make-graph nodes) graph? (nodes graph->nodes)) (define (make-graph . nodes) (let ((g (%make-graph nodes))) (for-each (lambda (node) (if (node? node) (set-node-graph! node g) (error "Bad node -- make-graph" node))) nodes) g)) (define (graph->node graph node-name) (find-matching-item (graph->nodes graph) (lambda (node) (eqv? node-name (node->name node))))) ;;;; File: maze.scm ;;;; Representing the maze using the mechanism of graph.scm. (define (setup-a-maze) (let ((start (make-node 'start)) (a (make-node 'aa)) (b (make-node 'bb)) (c (make-node 'cc)) (d (make-node 'dd)) (e (make-node 'ee)) (f (make-node 'ff))) (make-edge start b) (make-edge b a) (make-edge a start) (make-edge b d) (make-edge d b) (make-edge a e) (make-edge c a) (make-edge b c) (make-edge c d) (make-edge e c) (make-edge c f) (make-edge f e) (eq-put! d 'food '*) (eq-put! e 'food '*) (make-graph start a b c d e f))) ;;; Our rats run through the maze to find all of the food. (define (fat-rat maze) ;; Default is breadth-first (let ((start-node (graph->node maze 'start))) (graph:search start-node (lambda (node) (eq-get node 'food)) (lambda (found proceed) (write-line `(start --> ,(node->name found) = ,(eq-get found start-node))) (proceed))))) (define (mole-rat maze) ;; Specify depth-first (let ((start-node (graph->node maze 'start))) (graph:search start-node (lambda (node) (eq-get node 'food)) (lambda (found proceed) (write-line `(start --> ,(node->name found) = ,(eq-get found start-node))) (proceed)) depth-first-add-edges!))) #| ;;; Let's run our rats, and watch their progress. (set! graph:search-wallp #t) (define maze-1 (setup-a-maze)) (fat-rat maze-1) (edge start bb 0) (edge bb cc 2) (edge bb dd 4) (start --> dd = 2) (edge bb aa 4) (edge cc ff 5) (edge cc dd 5) (edge cc aa 4) (edge dd bb 3) (edge aa ee 2) (start --> ee = 3) (edge aa start 2) (edge ff ee 1) (edge ee cc 0) ;Value: #f (define maze-2 (setup-a-maze)) (mole-rat maze-2) (edge start bb 0) (edge bb aa 2) (edge aa start 3) (edge aa ee 2) (start --> ee = 3) (edge ee cc 2) (edge cc aa 4) (edge cc dd 3) (start --> dd = 5) (edge dd bb 3) (edge cc ff 2) (edge ff ee 2) (edge bb dd 1) (edge bb cc 0) ;Value: #f ;;; Notice that the distance from start to dd was 5 edges, ;;; but after the run was over it was only 2 edges. Why? (eq-get (graph->node maze-2 'dd) (graph->node maze-2 'start)) ;Value: 2 |# ;;;; File: load.scm (load "search") ;;; Support for the search engine (load "eq-prop") (load "eq-sets") (load "queue") ;;; Representation of graphs (load "graph") ;;; The rat example (load "maze")