Next: , Previous: , Up: Other Packages   [Contents][Index]

7.2 Sorting and Searching


7.2.1 Common List Functions

(require 'common-list-functions)

The procedures below follow the Common LISP equivalents apart from optional arguments in some cases.


7.2.1.1 List construction

Function: make-list k
Function: make-list k init

make-list creates and returns a list of k elements. If init is included, all elements in the list are initialized to init.

Example:

(make-list 3)
   ⇒ (#<unspecified> #<unspecified> #<unspecified>)
(make-list 5 'foo)
   ⇒ (foo foo foo foo foo)
Function: list* obj1 obj2 …

Works like list except that the cdr of the last pair is the last argument unless there is only one argument, when the result is just that argument. Sometimes called cons*. E.g.:

(list* 1)
   ⇒ 1
(list* 1 2 3)
   ⇒ (1 2 . 3)
(list* 1 2 '(3 4))
   ⇒ (1 2 3 4)
(list* args '())
   ≡ (list args)
Function: copy-list lst

copy-list makes a copy of lst using new pairs and returns it. Only the top level of the list is copied, i.e., pairs forming elements of the copied list remain eq? to the corresponding elements of the original; the copy is, however, not eq? to the original, but is equal? to it.

Example:

(copy-list '(foo foo foo))
   ⇒ (foo foo foo)
(define q '(foo bar baz bang))
(define p q)
(eq? p q)
   ⇒ #t
(define r (copy-list q))
(eq? q r)
   ⇒ #f
(equal? q r)
   ⇒ #t
(define bar '(bar))
(eq? bar (car (copy-list (list bar 'foo))))
⇒ #t
   

7.2.1.2 Lists as sets

eqv? is used to test for membership by procedures which treat lists as sets.

Function: adjoin e l

adjoin returns the adjoint of the element e and the list l. That is, if e is in l, adjoin returns l, otherwise, it returns (cons e l).

Example:

(adjoin 'baz '(bar baz bang))
   ⇒ (bar baz bang)
(adjoin 'foo '(bar baz bang))
   ⇒ (foo bar baz bang)
Function: union l1 l2

union returns a list of all elements that are in l1 or l2. Duplicates between l1 and l2 are culled. Duplicates within l1 or within l2 may or may not be removed.

Example:

(union '(1 2 3 4) '(5 6 7 8))
   ⇒ (1 2 3 4 5 6 7 8)
(union '(0 1 2 3 4) '(3 4 5 6))
   ⇒ (5 6 0 1 2 3 4)
Function: intersection l1 l2

intersection returns a list of all elements that are in both l1 and l2.

Example:

(intersection '(1 2 3 4) '(3 4 5 6))
   ⇒ (3 4)
(intersection '(1 2 3 4) '(5 6 7 8))
   ⇒ ()
Function: set-difference l1 l2

set-difference returns a list of all elements that are in l1 but not in l2.

Example:

(set-difference '(1 2 3 4) '(3 4 5 6))
   ⇒ (1 2)
(set-difference '(1 2 3 4) '(1 2 3 4 5 6))
   ⇒ ()
Function: subset? list1 list2

Returns #t if every element of list1 is eqv? an element of list2; otherwise returns #f.

Example:

(subset? '(1 2 3 4) '(3 4 5 6))
   ⇒ #f
(subset? '(1 2 3 4) '(6 5 4 3 2 1 0))
   ⇒ #t
Function: member-if pred lst

member-if returns the list headed by the first element of lst to satisfy (pred element). Member-if returns #f if pred returns #f for every element in lst.

Example:

(member-if vector? '(a 2 b 4))
   ⇒ #f
(member-if number? '(a 2 b 4))
   ⇒ (2 b 4)
Function: some pred lst1 lst2 …

pred is a boolean function of as many arguments as there are list arguments to some i.e., lst plus any optional arguments. pred is applied to successive elements of the list arguments in order. some returns #t as soon as one of these applications returns #t, and is #f if none returns #t. All the lists should have the same length.

Example:

(some odd? '(1 2 3 4))
   ⇒ #t

(some odd? '(2 4 6 8))
   ⇒ #f

(some > '(1 3) '(2 4))
   ⇒ #f
Function: every pred lst1 lst2 …

every is analogous to some except it returns #t if every application of pred is #t and #f otherwise.

Example:

(every even? '(1 2 3 4))
   ⇒ #f

(every even? '(2 4 6 8))
   ⇒ #t

(every > '(2 3) '(1 4))
   ⇒ #f
Function: notany pred lst1 …

notany is analogous to some but returns #t if no application of pred returns #t or #f as soon as any one does.

Function: notevery pred lst1 …

notevery is analogous to some but returns #t as soon as an application of pred returns #f, and #f otherwise.

Example:

(notevery even? '(1 2 3 4))
   ⇒ #t

(notevery even? '(2 4 6 8))
   ⇒ #f
Function: list-of?? predicate

Returns a predicate which returns true if its argument is a list every element of which satisfies predicate.

Function: list-of?? predicate low-bound high-bound

low-bound and high-bound are non-negative integers. list-of?? returns a predicate which returns true if its argument is a list of length between low-bound and high-bound (inclusive); every element of which satisfies predicate.

Function: list-of?? predicate bound

bound is an integer. If bound is negative, list-of?? returns a predicate which returns true if its argument is a list of length greater than (- bound); every element of which satisfies predicate. Otherwise, list-of?? returns a predicate which returns true if its argument is a list of length less than or equal to bound; every element of which satisfies predicate.

Function: find-if pred lst

find-if searches for the first element in lst such that (pred element) returns #t. If it finds any such element in lst, element is returned. Otherwise, #f is returned.

Example:

(find-if number? '(foo 1 bar 2))
   ⇒ 1

(find-if number? '(foo bar baz bang))
   ⇒ #f

(find-if symbol? '(1 2 foo bar))
   ⇒ foo
Function: remove elt lst

remove removes all occurrences of elt from lst using eqv? to test for equality and returns everything that’s left. N.B.: other implementations (Chez, Scheme->C and T, at least) use equal? as the equality test.

Example:

(remove 1 '(1 2 1 3 1 4 1 5))
   ⇒ (2 3 4 5)

(remove 'foo '(bar baz bang))
   ⇒ (bar baz bang)
Function: remove-if pred lst

remove-if removes all elements from lst where (pred element) is #t and returns everything that’s left.

Example:

(remove-if number? '(1 2 3 4))
   ⇒ ()

(remove-if even? '(1 2 3 4 5 6 7 8))
   ⇒ (1 3 5 7)
Function: remove-if-not pred lst

remove-if-not removes all elements from lst for which (pred element) is #f and returns everything that’s left.

Example:

(remove-if-not number? '(foo bar baz))
   ⇒ ()
(remove-if-not odd? '(1 2 3 4 5 6 7 8))
   ⇒ (1 3 5 7)
Function: has-duplicates? lst

returns #t if 2 members of lst are equal?, #f otherwise.

Example:

(has-duplicates? '(1 2 3 4))
   ⇒ #f

(has-duplicates? '(2 4 3 4))
   ⇒ #t

The procedure remove-duplicates uses member (rather than memv).

Function: remove-duplicates lst

returns a copy of lst with its duplicate members removed. Elements are considered duplicate if they are equal?.

Example:

(remove-duplicates '(1 2 3 4))
   ⇒ (1 2 3 4)

(remove-duplicates '(2 4 3 4))
   ⇒ (2 4 3)

7.2.1.3 Lists as sequences

Function: position obj lst

position returns the 0-based position of obj in lst, or #f if obj does not occur in lst.

Example:

(position 'foo '(foo bar baz bang))
   ⇒ 0
(position 'baz '(foo bar baz bang))
   ⇒ 2
(position 'oops '(foo bar baz bang))
   ⇒ #f
Function: reduce p lst

reduce combines all the elements of a sequence using a binary operation (the combination is left-associative). For example, using +, one can add up all the elements. reduce allows you to apply a function which accepts only two arguments to more than 2 objects. Functional programmers usually refer to this as foldl. collect:reduce (see Collections) provides a version of collect generalized to collections.

Example:

(reduce + '(1 2 3 4))
   ⇒ 10
(define (bad-sum . l) (reduce + l))
(bad-sum 1 2 3 4)
   ≡ (reduce + (1 2 3 4))
   ≡ (+ (+ (+ 1 2) 3) 4)
⇒ 10
(bad-sum)
   ≡ (reduce + ())
   ⇒ ()
(reduce string-append '("hello" "cruel" "world"))
   ≡ (string-append (string-append "hello" "cruel") "world")
   ⇒ "hellocruelworld"
(reduce anything '())
   ⇒ ()
(reduce anything '(x))
   ⇒ x

What follows is a rather non-standard implementation of reverse in terms of reduce and a combinator elsewhere called C.

;;; Contributed by Jussi Piitulainen (jpiitula @ ling.helsinki.fi)

(define commute
  (lambda (f)
    (lambda (x y)
      (f y x))))

(define reverse
  (lambda (args)
    (reduce-init (commute cons) '() args)))
Function: reduce-init p init lst

reduce-init is the same as reduce, except that it implicitly inserts init at the start of the list. reduce-init is preferred if you want to handle the null list, the one-element, and lists with two or more elements consistently. It is common to use the operator’s idempotent as the initializer. Functional programmers usually call this foldl.

Example:

(define (sum . l) (reduce-init + 0 l))
(sum 1 2 3 4)
   ≡ (reduce-init + 0 (1 2 3 4))
   ≡ (+ (+ (+ (+ 0 1) 2) 3) 4)
   ⇒ 10
(sum)
   ≡ (reduce-init + 0 '())
   ⇒ 0

(reduce-init string-append "@" '("hello" "cruel" "world"))
≡
(string-append (string-append (string-append "@" "hello")
                               "cruel")
               "world")
⇒ "@hellocruelworld"

Given a differentiation of 2 arguments, diff, the following will differentiate by any number of variables.

(define (diff* exp . vars)
  (reduce-init diff exp vars))

Example:

;;; Real-world example:  Insertion sort using reduce-init.

(define (insert l item)
  (if (null? l)
      (list item)
      (if (< (car l) item)
          (cons (car l) (insert (cdr l) item))
          (cons item l))))
(define (insertion-sort l) (reduce-init insert '() l))

(insertion-sort '(3 1 4 1 5)
   ≡ (reduce-init insert () (3 1 4 1 5))
   ≡ (insert (insert (insert (insert (insert () 3) 1) 4) 1) 5)
   ≡ (insert (insert (insert (insert (3)) 1) 4) 1) 5)
   ≡ (insert (insert (insert (1 3) 4) 1) 5)
   ≡ (insert (insert (1 3 4) 1) 5)
   ≡ (insert (1 1 3 4) 5)
   ⇒ (1 1 3 4 5)
   
Function: last lst n

last returns the last n elements of lst. n must be a non-negative integer.

Example:

(last '(foo bar baz bang) 2)
   ⇒ (baz bang)
(last '(1 2 3) 0)
   ⇒ ()
Function: butlast lst n

butlast returns all but the last n elements of lst.

Example:

(butlast '(a b c d) 3)
   ⇒ (a)
(butlast '(a b c d) 4)
   ⇒ ()

last and butlast split a list into two parts when given identical arguments.

(last '(a b c d e) 2)
   ⇒ (d e)
(butlast '(a b c d e) 2)
   ⇒ (a b c)
Function: nthcdr n lst

nthcdr takes n cdrs of lst and returns the result. Thus (nthcdr 3 lst)(cdddr lst)

Example:

(nthcdr 2 '(a b c d))
   ⇒ (c d)
(nthcdr 0 '(a b c d))
   ⇒ (a b c d)
Function: butnthcdr n lst

butnthcdr returns all but the nthcdr n elements of lst.

Example:

(butnthcdr 3 '(a b c d))
   ⇒ (a b c)
(butnthcdr 4 '(a b c d))
   ⇒ (a b c d)

nthcdr and butnthcdr split a list into two parts when given identical arguments.

(nthcdr 2 '(a b c d e))
   ⇒ (c d e)
(butnthcdr 2 '(a b c d e))
   ⇒ (a b)
Function: butnth n lst

butnth returns a list of all but the nth element of lst.

Example:

(butnth 2 '(a b c d))
   ⇒ (a b d)
(butnth 4 '(a b c d))
   ⇒ (a b c d)

7.2.1.4 Destructive list operations

These procedures may mutate the list they operate on, but any such mutation is undefined.

Procedure: nconc args

nconc destructively concatenates its arguments. (Compare this with append, which copies arguments rather than destroying them.) Sometimes called append! (see Rev2 Procedures).

Example: You want to find the subsets of a set. Here’s the obvious way:

(define (subsets set)
  (if (null? set)
      '(())
      (append (map (lambda (sub) (cons (car set) sub))
                   (subsets (cdr set)))
              (subsets (cdr set)))))

But that does way more consing than you need. Instead, you could replace the append with nconc, since you don’t have any need for all the intermediate results.

Example:

(define x '(a b c))
(define y '(d e f))
(nconc x y)
   ⇒ (a b c d e f)
x
   ⇒ (a b c d e f)

nconc is the same as append! in sc2.scm.

Procedure: nreverse lst

nreverse reverses the order of elements in lst by mutating cdrs of the list. Sometimes called reverse!.

Example:

(define foo '(a b c))
(nreverse foo)
   ⇒ (c b a)
foo
   ⇒ (a)

Some people have been confused about how to use nreverse, thinking that it doesn’t return a value. It needs to be pointed out that

(set! lst (nreverse lst))

is the proper usage, not

(nreverse lst)

The example should suffice to show why this is the case.

Procedure: delete elt lst
Procedure: delete-if pred lst
Procedure: delete-if-not pred lst

Destructive versions of remove remove-if, and remove-if-not.

Example:

(define lst (list 'foo 'bar 'baz 'bang))
(delete 'foo lst)
   ⇒ (bar baz bang)
lst
   ⇒ (foo bar baz bang)

(define lst (list 1 2 3 4 5 6 7 8 9))
(delete-if odd? lst)
   ⇒ (2 4 6 8)
lst
   ⇒ (1 2 4 6 8)

Some people have been confused about how to use delete, delete-if, and delete-if, thinking that they don’t return a value. It needs to be pointed out that

(set! lst (delete el lst))

is the proper usage, not

(delete el lst)

The examples should suffice to show why this is the case.


7.2.1.5 Non-List functions

Function: and? arg1 …

and? checks to see if all its arguments are true. If they are, and? returns #t, otherwise, #f. (In contrast to and, this is a function, so all arguments are always evaluated and in an unspecified order.)

Example:

(and? 1 2 3)
   ⇒ #t
(and #f 1 2)
   ⇒ #f
Function: or? arg1 …

or? checks to see if any of its arguments are true. If any is true, or? returns #t, and #f otherwise. (To or as and? is to and.)

Example:

(or? 1 2 #f)
   ⇒ #t
(or? #f #f #f)
   ⇒ #f
Function: atom? object

Returns #t if object is not a pair and #f if it is pair. (Called atom in Common LISP.)

(atom? 1)
   ⇒ #t
(atom? '(1 2))
   ⇒ #f
(atom? #(1 2))   ; dubious!
   ⇒ #t

7.2.2 Tree operations

(require 'tree)

These are operations that treat lists a representations of trees.

Function: subst new old tree
Function: substq new old tree
Function: substv new old tree
Function: subst new old tree equ?

subst makes a copy of tree, substituting new for every subtree or leaf of tree which is equal? to old and returns a modified tree. The original tree is unchanged, but may share parts with the result.

substq and substv are similar, but test against old using eq? and eqv? respectively. If subst is called with a fourth argument, equ? is the equality predicate.

Examples:

(substq 'tempest 'hurricane '(shakespeare wrote (the hurricane)))
   ⇒ (shakespeare wrote (the tempest))
(substq 'foo '() '(shakespeare wrote (twelfth night)))
   ⇒ (shakespeare wrote (twelfth night . foo) . foo)
(subst '(a . cons) '(old . pair)
       '((old . spice) ((old . shoes) old . pair) (old . pair)))
   ⇒ ((old . spice) ((old . shoes) a . cons) (a . cons))
Function: copy-tree tree

Makes a copy of the nested list structure tree using new pairs and returns it. All levels are copied, so that none of the pairs in the tree are eq? to the original ones – only the leaves are.

Example:

(define bar '(bar))
(copy-tree (list bar 'foo))
   ⇒ ((bar) foo)
(eq? bar (car (copy-tree (list bar 'foo))))
   ⇒ #f

7.2.3 Chapter Ordering

(require 'chapter-order)

The ‘chap:’ functions deal with strings which are ordered like chapter numbers (or letters) in a book. Each section of the string consists of consecutive numeric or consecutive aphabetic characters of like case.

Function: chap:string<? string1 string2

Returns #t if the first non-matching run of alphabetic upper-case or the first non-matching run of alphabetic lower-case or the first non-matching run of numeric characters of string1 is string<? than the corresponding non-matching run of characters of string2.

(chap:string<? "a.9" "a.10")                    ⇒ #t
(chap:string<? "4c" "4aa")                      ⇒ #t
(chap:string<? "Revised^{3.99}" "Revised^{4}")  ⇒ #t
Function: chap:string>? string1 string2
Function: chap:string<=? string1 string2
Function: chap:string>=? string1 string2

Implement the corresponding chapter-order predicates.

Function: chap:next-string string

Returns the next string in the chapter order. If string has no alphabetic or numeric characters, (string-append string "0") is returnd. The argument to chap:next-string will always be chap:string<? than the result.

(chap:next-string "a.9")                ⇒ "a.10"
(chap:next-string "4c")                 ⇒ "4d"
(chap:next-string "4z")                 ⇒ "4aa"
(chap:next-string "Revised^{4}")        ⇒ "Revised^{5}"


7.2.4 Sorting

(require 'sort) or (require 'srfi-95)

[by Richard A. O’Keefe, 1991]

I am providing this source code with no restrictions at all on its use (but please retain D.H.D.Warren’s credit for the original idea).

The code of merge and merge! could have been quite a bit simpler, but they have been coded to reduce the amount of work done per iteration. (For example, we only have one null? test per iteration.)

I gave serious consideration to producing Common-LISP-compatible functions. However, Common LISP’s sort is our sort! (well, in fact Common LISP’s stable-sort is our sort!; merge sort is fast as well as stable!) so adapting CL code to Scheme takes a bit of work anyway. I did, however, appeal to CL to determine the order of the arguments.

The standard functions <, >, char<?, char>?, char-ci<?, char-ci>?, string<?, string>?, string-ci<?, and string-ci>? are suitable for use as comparison functions. Think of (less? x y) as saying when x must not precede y.

[Addendum by Aubrey Jaffer, 2006]

These procedures are stable when called with predicates which return #f when applied to identical arguments.

The sorted?, merge, and merge! procedures consume asymptotic time and space no larger than O(N), where N is the sum of the lengths of the sequence arguments. The sort and sort! procedures consume asymptotic time and space no larger than O(N*log(N)), where N is the length of the sequence argument.

All five functions take an optional key argument corresponding to a CL-style ‘&key’ argument. A less? predicate with a key argument behaves like:

(lambda (x y) (less? (key x) (key y)))

All five functions will call the key argument at most once per element.

The ‘!’ variants sort in place; sort! returns its sequence argument.

Function: sorted? sequence less?
Function: sorted? sequence less? key

Returns #t when the sequence argument is in non-decreasing order according to less? (that is, there is no adjacent pair … x y … for which (less? y x)).

Returns #f when the sequence contains at least one out-of-order pair. It is an error if the sequence is not a list or array (including vectors and strings).

Function: merge list1 list2 less?
Function: merge list1 list2 less? key

Merges two sorted lists, returning a freshly allocated list as its result.

Function: merge! list1 list2 less?
Function: merge! list1 list2 less? key

Merges two sorted lists, re-using the pairs of list1 and list2 to build the result. The result will be either list1 or list2.

Function: sort sequence less?
Function: sort sequence less? key

Accepts a list or array (including vectors and strings) for sequence; and returns a completely new sequence which is sorted according to less?. The returned sequence is the same type as the argument sequence. Given valid arguments, it is always the case that:

(sorted? (sort sequence less?) less?) ⇒ #t
Function: sort! sequence less?
Function: sort! sequence less? key

Returns list, array, vector, or string sequence which has been mutated to order its elements according to less?. Given valid arguments, it is always the case that:

(sorted? (sort! sequence less?) less?) ⇒ #t

Next: , Previous: , Up: Sorting and Searching   [Contents][Index]

7.2.5 Topological Sort

(require 'topological-sort) or (require 'tsort)

The algorithm is inspired by Cormen, Leiserson and Rivest (1990) Introduction to Algorithms, chapter 23.

Function: tsort dag pred
Function: topological-sort dag pred

where

dag

is a list of sublists. The car of each sublist is a vertex. The cdr is the adjacency list of that vertex, i.e. a list of all vertices to which there exists an edge from the car vertex.

pred

is one of eq?, eqv?, equal?, =, char=?, char-ci=?, string=?, or string-ci=?.

Sort the directed acyclic graph dag so that for every edge from vertex u to v, u will come before v in the resulting list of vertices.

Time complexity: O (|V| + |E|)

Example (from Cormen):

Prof. Bumstead topologically sorts his clothing when getting dressed. The first argument to tsort describes which garments he needs to put on before others. (For example, Prof Bumstead needs to put on his shirt before he puts on his tie or his belt.) tsort gives the correct order of dressing:

(require 'tsort)
(tsort '((shirt tie belt)
         (tie jacket)
         (belt jacket)
         (watch)
         (pants shoes belt)
         (undershorts pants shoes)
         (socks shoes))
       eq?)
⇒
(socks undershorts pants shoes watch shirt belt tie jacket)

7.2.6 Hashing

(require 'hash)

These hashing functions are for use in quickly classifying objects. Hash tables use these functions.

Function: hashq obj k
Function: hashv obj k
Function: hash obj k

Returns an exact non-negative integer less than k. For each non-negative integer less than k there are arguments obj for which the hashing functions applied to obj and k returns that integer.

For hashq, (eq? obj1 obj2) implies (= (hashq obj1 k) (hashq obj2)).

For hashv, (eqv? obj1 obj2) implies (= (hashv obj1 k) (hashv obj2)).

For hash, (equal? obj1 obj2) implies (= (hash obj1 k) (hash obj2)).

hash, hashv, and hashq return in time bounded by a constant. Notice that items having the same hash implies the items have the same hashv implies the items have the same hashq.


Next: , Previous: , Up: Sorting and Searching   [Contents][Index]

7.2.7 Space-Filling Curves


7.2.7.1 Multidimensional Space-Filling Curves

(require 'space-filling)

The algorithms and cell properties are described in http://people.csail.mit.edu/jaffer/Geometry/RMDSFF.pdf

Function: make-cell type rank side precession
Function: make-cell type rank side
Function: make-cell type rank

type must be the symbol diagonal, adjacent, or centered. rank must be an integer larger than 1. side, if present, must be an even integer larger than 1 if type is adjacent or an odd integer larger than 2 otherwise; side defaults to the smallest value. precession, if present, must be an integer between 0 and side^rank-1; it is relevant only when type is diagonal or centered.

Function: make-cell Hamiltonian-path-vector precession
Function: make-cell Hamiltonian-path-vector

type must be a vector of side^rank lists of rank of integers encoding the coordinate positions of a Hamiltonian path on the rank-dimensional grid of points starting and ending on corners of the grid. The starting corner must be the origin (all-zero coordinates). If the side-length is even, then the ending corner must be non-zero in only one coordinate; otherwise, the ending corner must be the furthest diagonally opposite corner from the origin.

make-cell returns a data object suitable for passing as the first argument to integer->coordinates or coordinates->integer.

Hilbert, Peano, and centered Peano cells are generated respectively by:

(make-cell 'adjacent rank 2)   ; Hilbert
(make-cell 'diagonal rank 3)   ; Peano
(make-cell 'centered rank 3)   ; centered Peano

In the conversion procedures, if the cell is diagonal or adjacent, then the coordinates and scalar must be nonnegative integers. If centered, then the integers can be negative.

Function: integer->coordinates cell u

integer->coordinates converts the integer u to a list of coordinates according to cell.

Function: coordinates->integer cell v

coordinates->integer converts the list of coordinates v to an integer according to cell.

coordinates->integer and integer->coordinates are inverse functions when passed the same cell argument.


7.2.7.2 Hilbert Space-Filling Curve

(require 'hilbert-fill)

The Hilbert Space-Filling Curve is a one-to-one mapping between a unit line segment and an n-dimensional unit cube. This implementation treats the nonnegative integers either as fractional bits of a given width or as nonnegative integers.

The integer procedures map the non-negative integers to an arbitrarily large n-dimensional cube with its corner at the origin and all coordinates are non-negative.

For any exact nonnegative integer scalar and exact integer rank > 2,

(= scalar (hilbert-coordinates->integer
           (integer->hilbert-coordinates scalar rank)))
                                       ⇒ #t

When treating integers as k fractional bits,

(= scalar (hilbert-coordinates->integer
           (integer->hilbert-coordinates scalar rank k)) k)
                                       ⇒ #t
Function: integer->hilbert-coordinates scalar rank

Returns a list of rank integer coordinates corresponding to exact non-negative integer scalar. The lists returned by integer->hilbert-coordinates for scalar arguments 0 and 1 will differ in the first element.

Function: integer->hilbert-coordinates scalar rank k

scalar must be a nonnegative integer of no more than rank*k bits.

integer->hilbert-coordinates Returns a list of rank k-bit nonnegative integer coordinates corresponding to exact non-negative integer scalar. The curves generated by integer->hilbert-coordinates have the same alignment independent of k.

Function: hilbert-coordinates->integer coords
Function: hilbert-coordinates->integer coords k

Returns an exact non-negative integer corresponding to coords, a list of non-negative integer coordinates.

7.2.7.3 Gray code

A Gray code is an ordering of non-negative integers in which exactly one bit differs between each pair of successive elements. There are multiple Gray codings. An n-bit Gray code corresponds to a Hamiltonian cycle on an n-dimensional hypercube.

Gray codes find use communicating incrementally changing values between asynchronous agents. De-laminated Gray codes comprise the coordinates of Hilbert space-filling curves.

Function: integer->gray-code k

Converts k to a Gray code of the same integer-length as k.

Function: gray-code->integer k

Converts the Gray code k to an integer of the same integer-length as k.

For any non-negative integer k,

(eqv? k (gray-code->integer (integer->gray-code k)))
Function: = k1 k2
Function: gray-code<? k1 k2
Function: gray-code>? k1 k2
Function: gray-code<=? k1 k2
Function: gray-code>=? k1 k2

These procedures return #t if their Gray code arguments are (respectively): equal, monotonically increasing, monotonically decreasing, monotonically nondecreasing, or monotonically nonincreasing.

For any non-negative integers k1 and k2, the Gray code predicate of (integer->gray-code k1) and (integer->gray-code k2) will return the same value as the corresponding predicate of k1 and k2.

7.2.7.4 Bitwise Lamination

Function: delaminate-list count ks

Returns a list of count integers comprised of the jth bit of the integers ks where j ranges from count-1 to 0.

(map (lambda (k) (number->string k 2))
     (delaminate-list 4 '(7 6 5 4 0 0 0 0)))
    ⇒ ("0" "11110000" "11000000" "10100000")

delaminate-list is its own inverse:

(delaminate-list 8 (delaminate-list 4 '(7 6 5 4 0 0 0 0)))
    ⇒ (7 6 5 4 0 0 0 0)

7.2.7.5 Peano Space-Filling Curve

(require 'peano-fill)

Function: natural->peano-coordinates scalar rank

Returns a list of rank nonnegative integer coordinates corresponding to exact nonnegative integer scalar. The lists returned by natural->peano-coordinates for scalar arguments 0 and 1 will differ in the first element.

Function: peano-coordinates->natural coords

Returns an exact nonnegative integer corresponding to coords, a list of nonnegative integer coordinates.

Function: integer->peano-coordinates scalar rank

Returns a list of rank integer coordinates corresponding to exact integer scalar. The lists returned by integer->peano-coordinates for scalar arguments 0 and 1 will differ in the first element.

Function: peano-coordinates->integer coords

Returns an exact integer corresponding to coords, a list of integer coordinates.


7.2.7.6 Sierpinski Curve

(require 'sierpinski)

Function: make-sierpinski-indexer max-coordinate

Returns a procedure (eg hash-function) of 2 numeric arguments which preserves nearness in its mapping from NxN to N.

max-coordinate is the maximum coordinate (a positive integer) of a population of points. The returned procedures is a function that takes the x and y coordinates of a point, (non-negative integers) and returns an integer corresponding to the relative position of that point along a Sierpinski curve. (You can think of this as computing a (pseudo-) inverse of the Sierpinski spacefilling curve.)

Example use: Make an indexer (hash-function) for integer points lying in square of integer grid points [0,99]x[0,99]:

(define space-key (make-sierpinski-indexer 100))

Now let’s compute the index of some points:

(space-key 24 78)               ⇒ 9206
(space-key 23 80)               ⇒ 9172

Note that locations (24, 78) and (23, 80) are near in index and therefore, because the Sierpinski spacefilling curve is continuous, we know they must also be near in the plane. Nearness in the plane does not, however, necessarily correspond to nearness in index, although it tends to be so.

Example applications:

  • Sort points by Sierpinski index to get heuristic solution to travelling salesman problem. For details of performance, see L. Platzman and J. Bartholdi, "Spacefilling curves and the Euclidean travelling salesman problem", JACM 36(4):719–737 (October 1989) and references therein.
  • Use Sierpinski index as key by which to store 2-dimensional data in a 1-dimensional data structure (such as a table). Then locations that are near each other in 2-d space will tend to be near each other in 1-d data structure; and locations that are near in 1-d data structure will be near in 2-d space. This can significantly speed retrieval from secondary storage because contiguous regions in the plane will tend to correspond to contiguous regions in secondary storage. (This is a standard technique for managing CAD/CAM or geographic data.)

7.2.8 Soundex

(require 'soundex)

Function: soundex name

Computes the soundex hash of name. Returns a string of an initial letter and up to three digits between 0 and 6. Soundex supposedly has the property that names that sound similar in normal English pronunciation tend to map to the same key.

Soundex was a classic algorithm used for manual filing of personal records before the advent of computers. It performs adequately for English names but has trouble with other languages.

See Knuth, Vol. 3 Sorting and searching, pp 391–2

To manage unusual inputs, soundex omits all non-alphabetic characters. Consequently, in this implementation:

(soundex <string of blanks>)    ⇒ ""
(soundex "")                    ⇒ ""

Examples from Knuth:

(map soundex '("Euler" "Gauss" "Hilbert" "Knuth"
                       "Lloyd" "Lukasiewicz"))
        ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")

(map soundex '("Ellery" "Ghosh" "Heilbronn" "Kant"
                        "Ladd" "Lissajous"))
        ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")

Some cases in which the algorithm fails (Knuth):

(map soundex '("Rogers" "Rodgers"))     ⇒ ("R262" "R326")

(map soundex '("Sinclair" "St. Clair")) ⇒ ("S524" "S324")

(map soundex '("Tchebysheff" "Chebyshev")) ⇒ ("T212" "C121")

7.2.10 Sequence Comparison

(require 'diff)

diff:edit-length implements the algorithm:

The values returned by diff:edit-length can be used to gauge the degree of match between two sequences.

diff:edits and diff:longest-common-subsequence combine the algorithm with the divide-and-conquer method outlined in:

If the items being sequenced are text lines, then the computed edit-list is equivalent to the output of the diff utility program. If the items being sequenced are words, then it is like the lesser known spiff program.

Function: diff:longest-common-subsequence array1 array2 p-lim
Function: diff:longest-common-subsequence array1 array2

array1 and array2 are one-dimensional arrays.

The non-negative integer p-lim, if provided, is maximum number of deletions of the shorter sequence to allow. diff:longest-common-subsequence will return #f if more deletions would be necessary.

diff:longest-common-subsequence returns a one-dimensional array of length (quotient (- (+ len1 len2) (diff:edit-length array1 array2)) 2) holding the longest sequence common to both arrays.

Function: diff:edits array1 array2 p-lim
Function: diff:edits array1 array2

array1 and array2 are one-dimensional arrays.

The non-negative integer p-lim, if provided, is maximum number of deletions of the shorter sequence to allow. diff:edits will return #f if more deletions would be necessary.

diff:edits returns a vector of length (diff:edit-length array1 array2) composed of a shortest sequence of edits transformaing array1 to array2.

Each edit is an integer:

k > 0

Inserts (array-ref array1 (+ -1 j)) into the sequence.

k < 0

Deletes (array-ref array2 (- -1 k)) from the sequence.

Function: diff:edit-length array1 array2 p-lim
Function: diff:edit-length array1 array2

array1 and array2 are one-dimensional arrays.

The non-negative integer p-lim, if provided, is maximum number of deletions of the shorter sequence to allow. diff:edit-length will return #f if more deletions would be necessary.

diff:edit-length returns the length of the shortest sequence of edits transformaing array1 to array2.

(diff:longest-common-subsequence "fghiejcklm" "fgehijkpqrlm")
⇒ "fghijklm"

(diff:edit-length "fghiejcklm" "fgehijkpqrlm")
⇒ 6

(diff:edits "fghiejcklm" "fgehijkpqrlm")
⇒ #A:fixZ32b(3 -5 -7 8 9 10)
       ; e  c  h p q  r

Next: , Previous: , Up: Other Packages   [Contents][Index]