Advanced Operations on Weight-Balanced Trees (slib)

6.3.3 Advanced Operations on Weight-Balanced Trees

In the following the size of a tree is the number of associations that the tree contains, and a smaller tree contains fewer associations.

procedure+: wt-tree/split< wt-tree bound: Returns a new tree containing all and only the associations in wt-tree which have a key that is less than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of the size of wt-tree.

procedure+: wt-tree/split> wt-tree bound: Returns a new tree containing all and only the associations in wt-tree which have a key that is greater than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of size of wt-tree.

procedure+: wt-tree/union wt-tree-1 wt-tree-2: Returns a new tree containing all the associations from both trees. This operation is asymmetric: when both trees have an association for the same key, the returned tree associates the datum from wt-tree-2 with the key. Thus if the trees are viewed as discrete maps then wt-tree/union computes the map override of wt-tree-1 by wt-tree-2. If the trees are viewed as sets the result is the set union of the arguments. The worst-case time required by this operation is proportional to the sum of the sizes of both trees. If the minimum key of one tree is greater than the maximum key of the other tree then the time required is at worst proportional to the logarithm of the size of the larger tree.

procedure+: wt-tree/intersection wt-tree-1 wt-tree-2: Returns a new tree containing all and only those associations from wt-tree-1 which have keys appearing as the key of an association in wt-tree-2. Thus the associated data in the result are those from wt-tree-1. If the trees are being used as sets the result is the set intersection of the arguments. As a discrete map operation, wt-tree/intersection computes the domain restriction of wt-tree-1 to (the domain of) wt-tree-2. The time required by this operation is never worse that proportional to the sum of the sizes of the trees.

procedure+: wt-tree/difference wt-tree-1 wt-tree-2: Returns a new tree containing all and only those associations from wt-tree-1 which have keys that do not appear as the key of an association in wt-tree-2. If the trees are viewed as sets the result is the asymmetric set difference of the arguments. As a discrete map operation, it computes the domain restriction of wt-tree-1 to the complement of (the domain of) wt-tree-2. The time required by this operation is never worse that proportional to the sum of the sizes of the trees.

procedure+: wt-tree/subset? wt-tree-1 wt-tree-2

Returns #t iff the key of each association in wt-tree-1 is the key of some association in wt-tree-2, otherwise returns #f. Viewed as a set operation, wt-tree/subset? is the improper subset predicate. A proper subset predicate can be constructed:

(define (proper-subset? s1 s2)
  (and (wt-tree/subset? s1 s2)
       (< (wt-tree/size s1) (wt-tree/size s2))))

As a discrete map operation, wt-tree/subset? is the subset test on the domain(s) of the map(s). In the worst-case the time required by this operation is proportional to the size of wt-tree-1.

procedure+: wt-tree/set-equal? wt-tree-1 wt-tree-2

Returns #t iff for every association in wt-tree-1 there is an association in wt-tree-2 that has the same key, and vice versa.

Viewing the arguments as sets wt-tree/set-equal? is the set equality predicate. As a map operation it determines if two maps are defined on the same domain.

This procedure is equivalent to

(lambda (wt-tree-1 wt-tree-2)
  (and (wt-tree/subset? wt-tree-1 wt-tree-2
       (wt-tree/subset? wt-tree-2 wt-tree-1)))

In the worst-case the time required by this operation is proportional to the size of the smaller tree.

procedure+: wt-tree/fold combiner initial wt-tree

This procedure reduces wt-tree by combining all the associations, using an reverse in-order traversal, so the associations are visited in reverse order. Combiner is a procedure of three arguments: a key, a datum and the accumulated result so far. Provided combiner takes time bounded by a constant, wt-tree/fold takes time proportional to the size of wt-tree.

A sorted association list can be derived simply:

(wt-tree/fold  (lambda (key datum list)
                 (cons (cons key datum) list))
               '()
               wt-tree))

The data in the associations can be summed like this:

(wt-tree/fold  (lambda (key datum sum) (+ sum datum))
               0
               wt-tree)

procedure+: wt-tree/for-each action wt-tree

This procedure traverses the tree in-order, applying action to each association. The associations are processed in increasing order of their keys. Action is a procedure of two arguments which take the key and datum respectively of the association. Provided action takes time bounded by a constant, wt-tree/for-each takes time proportional to in the size of wt-tree. The example prints the tree:

(wt-tree/for-each (lambda (key value)
                    (display (list key value)))
                  wt-tree))

procedure+: wt-tree/union-merge wt-tree-1 wt-tree-2 merge

Returns a new tree containing all the associations from both trees. If both trees have an association for the same key, the datum associated with that key in the result tree is computed by applying the procedure merge to the key, the value from wt-tree-1 and the value from wt-tree-2. Merge is of the form

(lambda (key datum-1 datum-2) ...)

If some key occurs only in one tree, that association will appear in the result tree without being processed by merge, so for this operation to make sense, either merge must have both a right and left identity that correspond to the association being absent in one of the trees, or some guarantee must be made, for example, all the keys in one tree are known to occur in the other.

These are all reasonable procedures for merge

(lambda (key val1 val2) (+ val1 val2))
(lambda (key val1 val2) (append val1 val2))
(lambda (key val1 val2) (wt-tree/union val1 val2))

However, a procedure like

(lambda (key val1 val2) (- val1 val2))

would result in a subtraction of the data for all associations with keys occuring in both trees but associations with keys occuring in only the second tree would be copied, not negated, as is presumably be intent. The programmer might ensure that this never happens.

This procedure has the same time behavior as wt-tree/union but with a slightly worse constant factor. Indeed, wt-tree/union might have been defined like this:

(define (wt-tree/union tree1 tree2)
  (wt-tree/union-merge tree1 tree2
                       (lambda (key val1 val2) val2)))

The merge procedure takes the key as a parameter in case the data are not independent of the key.