Previous: Continuations, Up: Operations


6.2.14 Evaluation

SCM uses its type representations to speed evaluation. All of the subr types (see Subr Cells) are tc7 types. Since the tc7 field is in the low order bit position of the CAR it can be retrieved and dispatched on quickly by dereferencing the SCM pointer pointing to it and masking the result.

All the SCM Special Forms get translated to immediate symbols (isym) the first time they are encountered by the interpreter (ceval). The representation of these immediate symbols is engineered to occupy the same bits as tc7. All the isyms occur only in the CAR of lists.

If the CAR of a expression to evaluate is not immediate, then it may be a symbol. If so, the first time it is encountered it will be converted to an immediate type ILOC or GLOC (see Immediates). The codes for ILOC and GLOC lower 7 bits distinguish them from all the other types we have discussed.

Once it has determined that the expression to evaluate is not immediate, ceval need only retrieve and dispatch on the low order 7 bits of the CAR of that cell, regardless of whether that cell is a closure, header, or subr, or a cons containing ILOC or GLOC.

In order to be able to convert a SCM symbol pointer to an immediate ILOC or GLOC, the evaluator must be holding the pointer to the list in which that symbol pointer occurs. Turning this requirement to an advantage, ceval does not recursively call itself to evaluate symbols in lists; It instead calls the macro EVALCAR. EVALCAR does symbol lookup and memoization for symbols, retrieval of values for ILOCs and GLOCs, returns other immediates, and otherwise recursively calls itself with the CAR of the list.

ceval inlines evaluation (using EVALCAR) of almost all procedure call arguments. When ceval needs to evaluate a list of more than length 3, the procedure eval_args is called. So ceval can be said to have one level lookahead. The avoidance of recursive invocations of ceval for the most common cases (special forms and procedure calls) results in faster execution. The speed of the interpreter is currently limited on most machines by interpreter size, probably having to do with its cache footprint. In order to keep the size down, certain EVALCAR calls which don't need to be fast (because they rarely occur or because they are part of expensive operations) are instead calls to the C function evalcar.

— Variable: symhash

Top level symbol values are stored in the symhash table. symhash is an array of lists of ISYMs and pairs of symbols and values.

— Immediate: ILOC

Whenever a symbol's value is found in the local environment the pointer to the symbol in the code is replaced with an immediate object (ILOC) which specifies how many environment frames down and how far in to go for the value. When this immediate object is subsequently encountered, the value can be retrieved quickly.

ILOCs work up to a maximum depth of 4096 frames or 4096 identifiers in a frame. Radey Shouman added FARLOC to handle cases exceeding these limits. A FARLOC consists of a pair whose CAR is the immediate type IM_FARLOC_CAR or IM_FARLOC_CDR, and whose CDR is a pair of INUMs specifying the frame and distance with a larger range than ILOCs span.

Adding #define TEST_FARLOC to eval.c causes FARLOCs to be generated for all local identifiers; this is useful only for testing memoization.

— Immediate: GLOC

Pointers to symbols not defined in local environments are changed to one plus the value cell address in symhash. This incremented pointer is called a GLOC. The low order bit is normally reserved for GCmark; But, since references to variables in the code always occur in the CAR position and the GCmark is in the CDR, there is no conflict.

If the compile FLAG CAUTIOUS is #defined then the number of arguments is always checked for application of closures. If the compile FLAG RECKLESS is #defined then they are not checked. Otherwise, number of argument checks for closures are made only when the function position (whose value is the closure) of a combination is not an ILOC or GLOC. When the function position of a combination is a symbol it will be checked only the first time it is evaluated because it will then be replaced with an ILOC or GLOC.

— Macro: EVAL expression env
— Macro: SIDEVAL expression env

EVAL Returns the result of evaluating expression in env. SIDEVAL evaluates expression in env when the value of the expression is not used.

Both of these macros alter the list structure of expression as it is memoized and hence should be used only when it is known that expression will not be referenced again. The C function eval is safe from this problem.

— Function: SCM eval (SCM expression)

Returns the result of evaluating expression in the top-level environment. eval copies expression so that memoization does not modify expression.