The garbage collector is in the latter half of sys.c. The primary goal of garbage collection (or GC) is to recycle those cells no longer in use. Immediates always appear as parts of other objects, so they are not subject to explicit garbage collection.
All cells reside in the heap (composed of heap segments). Note that this is different from what Computer Science usually defines as a heap.
The first step in garbage collection is to mark all heap objects
in use. Each heap cell has a bit reserved for this purpose. For pairs
(cons cells) the lowest order bit (0) of the CDR is used. For other
types, bit 8 of the CAR is used. The GC bits are never set except
during garbage collection. Special C macros are defined in scm.h
to allow easy manipulation when GC bits are possibly set. CAR, TYP3,
and TYP7 can be used on GC marked cells as they are.
Returns the CDR of a cons cell, even if that cell has been GC marked.
Returns the 16 bit type code of a cell.
We need to (recursively) mark only a few objects in order to assure that
all accessible objects are marked. Those objects are sys_protects[]
(for example, dynwinds), the current C-stack and the hash table for
symbols, symhash.
The function gc_mark() is used for marking SCM cells. If obj is
marked, gc_mark() returns. If obj is unmarked, gc_mark() sets the mark
bit in obj, then calls gc_mark() on any SCM components of obj. The last
call to gc_mark() is tail-called (looped).
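The marking rule just described can be sketched in a few lines of C. This is a toy model and not the code in sys.c: the names toy_cell, toy_mark, and toy_count_marked are invented for illustration, and the mark is kept in a separate field instead of being packed into the CAR or CDR word as SCM does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy cell.  SCM's real cells keep the mark bit inside
   the CAR or CDR word; a separate flag keeps this sketch readable. */
typedef struct toy_cell {
    struct toy_cell *car;   /* NULL stands in for an immediate */
    struct toy_cell *cdr;
    int marked;
} toy_cell;

/* Mark everything reachable from obj.  The CAR is handled by a real
   recursive call; the CDR, being the last component, is handled by
   looping instead of recursing. */
static void toy_mark(toy_cell *obj)
{
    while (obj != NULL && !obj->marked) {
        obj->marked = 1;
        toy_mark(obj->car);
        obj = obj->cdr;     /* the tail call turned into a loop */
    }
}

/* Count the marked cells in an array of n cells. */
static int toy_count_marked(const toy_cell *cells, size_t n)
{
    int count = 0;
    assert(cells != NULL);
    for (size_t i = 0; i < n; i++)
        count += cells[i].marked;
    return count;
}
```

Because an already marked cell stops the loop at once, cyclic structures terminate without any extra bookkeeping.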
The function mark_locations is used for marking segments of
C-stack or saved segments of C-stack (marked continuations). The
argument len is the size of the stack in units of size (STACKITEM).
Each longword in the stack is tested to see whether it is a valid cell
pointer into the heap. If it is, the object itself and any objects it
points to are marked using gc_mark. If the stack is word rather than
longword aligned (#define WORD_ALIGN), both alignments are tried.
This arrangement will occasionally mark an object which is no longer
used. This has not been a problem in practice and the advantage of
using the C-stack far outweighs it.
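A minimal sketch of such a conservative scan follows. It assumes a single contiguous toy heap and keeps the marks in a side table; toy_heap, toy_marks, toy_cell_index, and toy_mark_locations are hypothetical names, and the real mark_locations also has to cope with multiple heap segments and with alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { void *car, *cdr; } toy_cell;

#define HEAP_CELLS 8
static toy_cell toy_heap[HEAP_CELLS];
static int toy_marks[HEAP_CELLS];

/* Return the heap index that word refers to, or -1 if word is not a
   cell-aligned address inside toy_heap. */
static int toy_cell_index(uintptr_t word)
{
    uintptr_t lo = (uintptr_t)&toy_heap[0];
    uintptr_t hi = (uintptr_t)&toy_heap[HEAP_CELLS];
    if (word < lo || word >= hi)
        return -1;
    if ((word - lo) % sizeof(toy_cell) != 0)
        return -1;
    return (int)((word - lo) / sizeof(toy_cell));
}

/* Conservatively scan len words, marking each cell that some word
   appears to point at.  Returns the number of newly marked cells. */
static int toy_mark_locations(const uintptr_t *words, size_t len)
{
    int newly = 0;
    assert(words != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        int idx = toy_cell_index(words[i]);
        if (idx >= 0 && !toy_marks[idx]) {
            toy_marks[idx] = 1;
            newly++;
        }
    }
    return newly;
}
```

Any word that happens to hold a valid cell address keeps that cell alive, whether or not it is really a pointer; this is the source of the occasional over-marking mentioned above.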
After all found objects have been marked, the heap is swept.
The storage for strings, vectors, continuations, doubles, complexes, and bignums is managed by malloc. There is only one pointer to each malloc object from its type-header cell in the heap. This allows malloc objects to be freed when the associated heap object is garbage collected.
The function gc_sweep scans through all heap segments. The mark
bit is cleared from marked cells. Unmarked cells are spliced into
freelist, where they can again be returned by invocations of NEWCELL.
If a type-header cell pointing to malloc space is unmarked, the malloc
object is freed. If the type header of a smob is collected, the smob’s
free procedure is called to free its storage.
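The sweep can be sketched in the same toy style. The names toy_cell, toy_sweep, and toy_freelist_length are invented here, the sketch handles one segment only, and it rebuilds the freelist from scratch on each sweep, which is an assumption of the sketch and not a statement about gc_sweep. Freeing of malloc objects and smobs is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_cell {
    struct toy_cell *next_free; /* link used only while on the freelist */
    int marked;
} toy_cell;

/* Sweep n cells: clear the mark on survivors and push every unmarked
   cell onto a fresh freelist.  Returns the head of that freelist. */
static toy_cell *toy_sweep(toy_cell *cells, size_t n)
{
    toy_cell *freelist = NULL;
    assert(cells != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cells[i].marked) {
            cells[i].marked = 0;            /* survivor: unmark it */
        } else {
            cells[i].next_free = freelist;  /* garbage: recycle it */
            freelist = &cells[i];
        }
    }
    return freelist;
}

/* Number of cells currently on the freelist. */
static size_t toy_freelist_length(const toy_cell *freelist)
{
    size_t len = 0;
    for (; freelist != NULL; freelist = freelist->next_free)
        len++;
    return len;
}
```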
The memory management component of SCM contains special features which optimize the allocation and garbage collection of environments.
The optimizations are based on certain facts and assumptions:
The SCM evaluator creates many environments with short lifetimes and these account for a large portion of the total number of objects allocated.
The general purpose allocator allocates objects from a freelist, and collects using a mark/sweep algorithm. Research into garbage collection suggests that such an allocator is sub-optimal for object populations containing a large portion of short-lived members and that allocation strategies involving a copying collector are more appropriate.
It is a property of SCM, reflected throughout the source code, that a simple copying collector can not be used as the general purpose memory manager: much code assumes that the run-time stack can be treated as a garbage collection root set using conservative garbage collection techniques, which are incompatible with objects that change location.
Nevertheless, it is possible to use a mostly-separate copying-collector, just for environments. Roughly speaking, cons pairs making up environments are initially allocated from a small heap that is collected by a precise copying collector. These objects must be handled specially for the collector to work. The (presumably) small number of these objects that survive one collection of the copying heap are copied to the general purpose heap, where they will later be collected by the mark/sweep collector. The remaining pairs are more rapidly collected than they would otherwise be and all of this collection is accomplished without having to mark or sweep any other segment of the heap.
Allocating cons pairs for environments from this special heap is a heuristic that approximates the (unachievable) goal:
allocate all short-lived objects from the copying-heap, at no extra cost in allocation time.
A separate heap (ecache_v) is maintained for the copying
collector. Pairs are allocated from this heap in a stack-like fashion.
Objects in this heap may be protected from garbage collection by:
A separate stack (scm_estk), which is used in place of the C
run-time stack by the SCM evaluator to hold local variables which refer
to the copying heap.
A copying-collector root set (scm_egc_roots). If no object in the
mark/sweep heap directly references an object from the copying heap,
that object can be preserved by storing a direct reference to it in the
copying-collector root set.
When the copying heap or root-set becomes full, the copying collector is invoked. All protected objects are copied to the mark-sweep heap. All references to those objects are updated. The copying collector root-set and heap are emptied.
References to pairs allocated specifically for environments are inaccessible to the Scheme procedures evaluated by SCM. These pairs are manipulated by only a small number of code fragments in the interpreter. To support copying collection, those code fragments (mostly in eval.c) have been modified to protect environments from garbage collection using the three rules listed above.
During a mark-sweep collection, the copying collector heap is marked and swept almost like any ordinary segment of the general purpose heap. The only difference is that pairs from the copying heap that become free during a sweep phase are not added to the freelist.
The environment cache is disabled by adding #define NO_ENV_CACHE
to eval.c; all environment cells are then allocated from the
regular heap.
This work seems to build upon a considerable amount of previous work into garbage collection techniques about which a considerable amount of literature is available.
Dynamic linking has not been ported to all platforms. Operating systems
in the BSD family (a.out binary format) can usually be ported to
DLD. The dl library (#define SUN_DL for SCM) was a
proposed POSIX standard and may be available on other machines with
COFF binary format. For notes about porting to MS-Windows and
finishing the port to VMS, see VMS Dynamic Linking.
DLD is a library package of C functions that performs dynamic link editing on GNU/Linux, VAX (Ultrix), Sun 3 (SunOS 3.4 and 4.0), SPARCstation (SunOS 4.0), Sequent Symmetry (Dynix), and Atari ST. It is available from:
These notes about using libdl on SunOS are from gcc.info:
On a Sun, linking using GNU CC fails to find a shared library and reports that the library doesn’t exist at all.
This happens if you are using the GNU linker, because it does only static linking and looks only for unshared libraries. If you have a shared library with no unshared counterpart, the GNU linker won’t find anything.
We hope to make a linker which supports Sun shared libraries, but please don’t ask when it will be finished–we don’t know.
Sun forgot to include a static version of libdl.a with some versions of SunOS (mainly 4.1). This results in undefined symbols when linking static binaries (that is, if you use ‘-static’). If you see undefined symbols ‘_dlclose’, ‘_dlsym’ or ‘_dlopen’ when linking, compile and link against the file mit/util/misc/dlsym.c from the MIT version of X windows.
The SLIB module catalog can be extended to define other
require-able packages by adding calls to the Scheme source file
mkimpcat.scm. Within mkimpcat.scm, the following
procedures are defined.
feature should be a symbol. object-file should be a string
naming a file containing compiled object-code. Each libn
argument should be either a string naming a library file or #f.
If object-file exists, the add-link procedure registers
symbol feature so that the first time require is called
with the symbol feature as its argument, object-file and the
lib1 … are dynamically linked into the executing SCM
session.
If object-file exists, add-link returns #t,
otherwise it returns #f.
For example, to install a compiled dll foo, add these lines to mkimpcat.scm:
(add-link 'foo (in-vicinity (implementation-vicinity) "foo" link:able-suffix))
alias and feature are symbols. The procedure add-alias
registers alias as an alias for feature.
An unspecified value is returned.
add-alias causes (require 'alias) to behave like
(require 'feature).
feature is a symbol. filename is a string naming a file
containing Scheme source code. The procedure add-source
registers feature so that the first time require is called
with the symbol feature as its argument, the file filename
will be loaded. An unspecified value is returned.
Remember to delete the file slibcat after modifying the file mkimpcat.scm in order to force SLIB to rebuild its cache.
These ‘#defines’ are automatically provided by preprocessors of
various C compilers. SCM uses the presence or absence of these
definitions to configure include file locations and aliases for
library functions. If the definitions corresponding to your system
type are missing as your system is configured, add -Dflag to
the compilation command lines or add a #define flag line to
the beginning of scmfig.h.
#define           Platforms:
-------           ----------
ARM_ULIB          Huw Rogers free unix library for acorn archimedes
AZTEC_C           Aztec_C 5.2a
__CYGWIN__        Cygwin
__CYGWIN32__      Cygwin
_DCC              Dice C on AMIGA
__GNUC__          Gnu CC (and DJGPP)
__EMX__           Gnu C port (gcc/emx 0.8e) to OS/2 2.0
__HIGHC__         MetaWare High C
__IBMC__          C-Set++ on OS/2 2.1
_MSC_VER          MS VisualC++ 4.2
MWC               Mark Williams C on COHERENT
__MWERKS__        Metrowerks Compiler; Macintosh and WIN32 (?)
_POSIX_SOURCE     ??
_QC               Microsoft QuickC
__STDC__          ANSI C compliant
__TURBOC__        Turbo C and Borland C
__USE_POSIX       ??
__WATCOMC__       Watcom C on MS-DOS
__ZTC__           Zortech C
_AIX              AIX operating system
__APPLE__         Apple Darwin
AMIGA             SAS/C 5.10 or Dice C on AMIGA
__amigaos__       Gnu CC on AMIGA
atarist           ATARI-ST under Gnu CC
__DragonflyBSD__  DragonflyBSD
__FreeBSD__       FreeBSD
GNUDOS            DJGPP (obsolete in version 1.08)
__GO32__          DJGPP (future?)
hpux              HP-UX
linux             GNU/Linux
macintosh         Macintosh (THINK_C and __MWERKS__ define)
MCH_AMIGA         Aztec_c 5.2a on AMIGA
__MACH__          Apple Darwin
__MINGW32__       MinGW - Minimalist GNU for Windows
MSDOS             Microsoft C 5.10 and 6.00A
_MSDOS            Microsoft CLARM and CLTHUMB compilers.
__MSDOS__         Turbo C, Borland C, and DJGPP
__NetBSD__        NetBSD
nosve             Control Data NOS/VE
__OpenBSD__       OpenBSD
SVR2              System V Revision 2.
sun               SunOS
__SVR4            SunOS
THINK_C           development environment for the Macintosh
ultrix            VAX with ULTRIX operating system.
unix              most Unix and similar systems and DJGPP (!?)
__unix__          Gnu CC and DJGPP
_UNICOS           Cray operating system
vaxc              VAX C compiler
VAXC              VAX C compiler
vax11c            VAX C compiler
VAX11             VAX C compiler
_Windows          Borland C 3.1 compiling for Windows
_WIN32            MS VisualC++ 4.2 and Cygwin (Win32 API)
_WIN32_WCE        MS Windows CE
vms               (and VMS) VAX-11 C under VMS.
__alpha           DEC Alpha processor
__alpha__         DEC Alpha processor
__hppa__          HP RISC processor
hp9000s800        HP RISC processor
__ia64            GCC on IA64
__ia64__          GCC on IA64
_LONGLONG         GCC on IA64
__i386__          DJGPP
i386              DJGPP
_M_ARM            Microsoft CLARM compiler defines as 4 for ARM.
_M_ARMT           Microsoft CLTHUMB compiler defines as 4 for Thumb.
MULTIMAX          Encore computer
ppc               PowerPC
__ppc__           PowerPC
pyr               Pyramid 9810 processor
__sgi__           Silicon Graphics Inc.
sparc             SPARC processor
sequent           Sequent computer
tahoe             CCI Tahoe processor
vax               VAX processor
__x86_64          AMD Opteron
init_signals (in scm.c) initializes handlers for SIGINT and
SIGALRM if they are supported by the C implementation. All of
the signal handlers immediately reestablish themselves by a call to
signal().
The low level handlers for SIGINT and SIGALRM.
If an interrupt handler is defined when the interrupt is received, the
code is interpreted. If the code returns, execution resumes from where
the interrupt happened. Call-with-current-continuation allows
the stack to be saved and restored.
SCM does not use any signal masking system calls. These are not a
portable feature. However, code can run uninterrupted by use of the C
macros DEFER_INTS and ALLOW_INTS.
DEFER_INTS sets the global variable ints_disabled to 1. If an interrupt
occurs during a time when ints_disabled is 1, then
deferred_proc is set to non-zero, one of the global variables
SIGINT_deferred or SIGALRM_deferred is set to 1, and the
handler returns.
ALLOW_INTS checks the deferred variables and, if one is set, calls the appropriate handler.
Calls to DEFER_INTS can not be nested. An ALLOW_INTS must
happen before another DEFER_INTS can be done. In order to check
that this constraint is satisfied, #define CAREFUL_INTS in
scmfig.h.
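The deferral scheme can be illustrated with standard C signal handling. The sketch below uses functions where SCM uses macros, handles SIGINT only, and all toy_ names are invented; it shows the idea of recording an interrupt while a flag is set and acting on it later, not SCM's actual definitions.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t toy_ints_disabled = 0;
static volatile sig_atomic_t toy_sigint_deferred = 0;
static volatile sig_atomic_t toy_handled = 0;

/* Hypothetical stand-in for the real work done on an interrupt. */
static void toy_process_sigint(void)
{
    toy_handled++;
}

/* Low level handler: reestablish itself, then either act at once or
   just record that the interrupt happened. */
static void toy_sigint_handler(int sig)
{
    signal(sig, toy_sigint_handler);
    if (toy_ints_disabled)
        toy_sigint_deferred = 1;
    else
        toy_process_sigint();
}

/* Counterpart of DEFER_INTS in this sketch. */
static void toy_defer_ints(void)
{
    assert(!toy_ints_disabled);    /* deferral must not be nested */
    toy_ints_disabled = 1;
}

/* Counterpart of ALLOW_INTS: run whatever was deferred. */
static void toy_allow_ints(void)
{
    toy_ints_disabled = 0;
    if (toy_sigint_deferred) {
        toy_sigint_deferred = 0;
        toy_process_sigint();
    }
}
```

The assertion in toy_defer_ints plays the role of a nesting check: a second deferral without an intervening toy_allow_ints aborts the program.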
ASRTER signals an error if the expression (cond) is 0. arg is the offending object, subr is the string naming the subr, and pos indicates the position or type of error. pos can be one of
ARGn (> 5 or unknown ARG number)
ARG1
ARG2
ARG3
ARG4
ARG5
WNA (wrong number of args)
OVFLOW
OUTOFRANGE
NALLOC
EXIT
HUP_SIGNAL
INT_SIGNAL
FPE_SIGNAL
BUS_SIGNAL
SEGV_SIGNAL
ALRM_SIGNAL
(char *)
Error checking is not done by ASRTER if the flag RECKLESS
is defined. An error condition can still be signaled in this case with
a call to wta(arg, pos, subr).
ASRTGO performs a goto label if the expression (cond) is 0. Like
ASRTER, ASRTGO is not active if the flag RECKLESS is defined.
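A checking macro that compiles away under a flag can be sketched as follows. TOY_ASRTER, TOY_RECKLESS, toy_wta, and toy_half are hypothetical names; unlike the real wta, toy_wta only counts and reports the error and then returns, and pos is simplified to a plain string.

```c
#include <assert.h>
#include <stdio.h>

static int toy_error_count = 0;

/* Hypothetical stand-in for wta(): record and report the error.
   Unlike the real routine, it returns to its caller. */
static void toy_wta(long arg, const char *pos, const char *subr)
{
    assert(pos != NULL && subr != NULL);
    toy_error_count++;
    fprintf(stderr, "ERROR: %s: %s: %ld\n", subr, pos, arg);
}

#ifdef TOY_RECKLESS
/* With the flag defined, the check disappears entirely. */
#define TOY_ASRTER(cond, arg, pos, subr) ((void)0)
#else
#define TOY_ASRTER(cond, arg, pos, subr) \
    do { if (!(cond)) toy_wta((arg), (pos), (subr)); } while (0)
#endif

/* Example subr: halves an even number, complaining about odd ones. */
static long toy_half(long n)
{
    TOY_ASRTER(n % 2 == 0, n, "Wrong type in arg1", "toy-half");
    return n / 2;
}
```

Compiling with -DTOY_RECKLESS removes the test from toy_half, while a direct call to toy_wta remains available for errors that must always be reported.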
When writing C-code for SCM, a precaution is recommended. If your
routine allocates a non-cons cell which will not be incorporated
into a SCM object which is returned, you need to make sure that a
SCM variable in your routine points to that cell as long as part
of it might be referenced by your code.
In order to make sure this SCM variable does not get optimized
out you can put this assignment after its last possible use:
SCM_dummy1 = foo;
or put this assignment somewhere in your routine:
SCM_dummy1 = (SCM) &foo;
SCM_dummy variables are not currently defined. Passing the
address of the local SCM variable to any procedure also
protects it. The procedure scm_protect_temp is provided for
this purpose.
Forces the SCM object ptr to be saved on the C-stack, where it will be traced for GC.
Also, if you maintain a static pointer to some (non-immediate)
SCM object, you must either make your pointer be the value cell
of a symbol (see errobj for an example) or (permanently) add
your pointer to sys_protects using:
Permanently adds obj to a table of objects protected from
garbage collection. scm_gc_protect returns obj.
To add a C routine to scm:
Add a make_subr or make_gsubr call to init_scm. Or
put an entry into the appropriate iproc structure.
To add a package of new procedures to scm (see crs.c for example):
static char s_twiddle_bits[]="twiddle-bits!";
static char s_bitsp[]="bits?";
Create an iproc structure for each subr type used in foo.c
static iproc subr3s[]= {
        {s_twiddle_bits,twiddle_bits},
        {s_bitsp,bitsp},
        {0,0} };
Add an init_<name of file> routine at the end of the file
which calls init_iprocs with the correct type for each of the
iprocs created in step 5.
void init_foo()
{
  init_iprocs(subr1s, tc7_subr_1);
  init_iprocs(subr3s, tc7_subr_3);
}
If your package needs to have a finalization routine called to
free up storage, close files, etc, then also have a line in
init_foo like:
add_final(final_foo);
final_foo should be a (void) procedure of no arguments. The
finals will be called in opposite order from their definition.
The line:
add_feature("foo");
will append a symbol 'foo to the (list) value of
slib:features.
Put an if into Init5f4.scm which loads
Ifoo.scm if your package is included:
(if (defined? twiddle-bits!)
    (load (in-vicinity (implementation-vicinity)
                       "Ifoo"
                       (scheme-file-suffix))))
or use (provided? 'foo) instead of (defined? twiddle-bits!)
if you have added the feature.
Add init_foo\(\)\; to the INITS=… line at the beginning of the makefile.
These steps should allow your package to be linked into SCM with a minimum of difficulty. Your package should also work with dynamic linking if your SCM has this capability.
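The finalization order mentioned in the steps above (finals are called in the opposite order from their definition) amounts to a stack of function pointers. The following sketch shows that behaviour with invented names (toy_add_final, toy_run_finals, and a fixed-size table); it makes no claim about how add_final stores its entries.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FINALS 16

typedef void (*toy_final_fn)(void);

static toy_final_fn toy_finals[TOY_MAX_FINALS];
static size_t toy_num_finals = 0;

/* Register a finalization routine of no arguments. */
static void toy_add_final(toy_final_fn f)
{
    assert(f != NULL && toy_num_finals < TOY_MAX_FINALS);
    toy_finals[toy_num_finals++] = f;
}

/* Call the registered routines, most recently added first. */
static void toy_run_finals(void)
{
    while (toy_num_finals > 0)
        toy_finals[--toy_num_finals]();
}

/* Two sample finalizers that log the order in which they run. */
static char toy_log[8];
static size_t toy_log_len = 0;
static void toy_final_a(void) { toy_log[toy_log_len++] = 'a'; }
static void toy_final_b(void) { toy_log[toy_log_len++] = 'b'; }
```

Running the finals in reverse lets a package that was initialized later, and may depend on earlier ones, shut down first.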
Special forms (new syntax) can be added to scm.
Define a new MAKISYM in scm.h and increment NUM_ISYMS.
Add the new name to isymnames in repl.c.
Add a case clause to ceval() near i_quasiquote (in eval.c).
New syntax can now be added without recompiling SCM by the use of the
procedure->syntax, procedure->macro, procedure->memoizing-macro, and
defmacro. For details, see Syntax.
SCM maintains a count of bytes allocated using malloc, and calls the
garbage collector when that number exceeds a dynamically managed limit.
In order for this to work properly, malloc and free should
not be called directly to manage memory freeable by garbage collection.
The following functions are provided for that purpose:
len is the number of bytes that should be allocated, what is
a string to be used in error or gc messages. must_malloc returns
a pointer to newly allocated memory. must_malloc_cell returns a
newly allocated cell whose car is c and whose cdr is
a pointer to newly allocated memory.
must_realloc_cell takes as argument z a cell whose
cdr should be a pointer to a block of memory of length olen
allocated with must_malloc_cell and modifies the cdr to point
to a block of memory of length len. must_realloc takes as
argument where the address of a block of memory of length olen
allocated by must_malloc and returns the address of a block of
length len.
The contents of the reallocated block will be unchanged up to the minimum of the old and new sizes.
what is a pointer to a string used for error and gc messages.
must_malloc, must_malloc_cell, must_realloc, and
must_realloc_cell must be called with interrupts deferred
(see Signals). must_realloc and must_realloc_cell must
not be called during initialization (non-zero errjmp_bad) – the initial
allocations must be large enough.
must_free is used to free a block of memory allocated by the
above functions and pointed to by ptr. len is the length of
the block in bytes, but this value is used only for debugging purposes.
If it is difficult or expensive to calculate then zero may be used
instead.
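The byte-counting idea behind these functions can be sketched as a thin wrapper around malloc. Everything named toy_ below is hypothetical, toy_gc is a stub that only counts invocations, and the rule for raising the limit (double the amount in use) is an assumption made for this sketch, not SCM's policy.

```c
#include <assert.h>
#include <stdlib.h>

static size_t toy_mallocated = 0;   /* bytes handed out so far */
static size_t toy_gc_limit = 1024;  /* hypothetical starting limit */
static int toy_gc_runs = 0;

/* Stub collector: a real one would free unreachable blocks and
   reduce toy_mallocated accordingly. */
static void toy_gc(void)
{
    toy_gc_runs++;
}

/* Allocate len bytes, collecting first when the running total would
   exceed the limit, and raising the limit if collection did not help. */
static void *toy_must_malloc(size_t len)
{
    void *p;
    assert(len > 0);
    if (toy_mallocated + len > toy_gc_limit) {
        toy_gc();
        if (toy_mallocated + len > toy_gc_limit)
            toy_gc_limit = 2 * (toy_mallocated + len);
    }
    p = malloc(len);
    if (p == NULL)
        abort();                    /* "must" succeed or give up */
    toy_mallocated += len;
    return p;
}
```

Calling malloc directly would bypass the counter, so the collector would not be triggered by that memory; this is why the text asks that all collectable storage go through the must_ functions.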
The file scmmain.c contains the definition of main(). When SCM is compiled as a library scmmain.c is not included in the library; a copy of scmmain.c can be modified to use SCM as an embedded library module.
This is the top level C routine. The value of the argc argument
is the number of command line arguments. The argv argument is a
vector of C strings; its elements are the individual command line
argument strings. A null pointer always follows the last element:
argv[argc] is this null pointer.
This string is the pathname of the executable file being run. This variable can be examined and set from Scheme (see Internal State). execpath must be set to the executable’s path in order to use DUMP (see Dump) or DLD.
Rename main() and arrange your code to call it with an argv which sets up SCM as you want it.
If you need more control than is possible through argv, here are descriptions of the functions which main() calls.
Call this before SCM calls malloc(). Value returned from sbrk() is used to gauge how much storage SCM uses.
argc and argv are as described in main(). script_arg
is the pathname of the SCSH-style script (see Scripting) being
invoked; 0 otherwise. scm_find_execpath returns the pathname of
the executable being run; if scm_find_execpath cannot determine
the pathname, then it returns 0.
scm_find_implpath is defined in scmmain.c. Preceding
this are definitions of GENERIC_NAME and INIT_GETENV. These,
along with IMPLINIT and dirsep control scm_find_implpath()’s
operation.
If your application has an easier way to locate initialization code for
SCM, then you can replace scm_find_implpath.
Returns the full pathname of the Scheme initialization file or 0 if it cannot find it.
The string value of the preprocessor variable INIT_GETENV names an
environment variable (default ‘"SCM_INIT_PATH"’). If this
environment variable is defined, its value will be returned from
scm_find_implpath. Otherwise find_impl_file() is called with the
arguments execpath, GENERIC_NAME (default "scm"),
INIT_FILE_NAME (default "Init5f4_scm"), and the
directory separator string dirsep. If find_impl_file() returns 0
and IMPLINIT is defined, then a copy of the string IMPLINIT
is returned.
Tries to determine whether inport (usually stdin) is an interactive input port which should be used in an unbuffered mode. If so, inport is set to unbuffered and non-zero is returned. Otherwise, 0 is returned.
init_buf0 should be called before any input is read from
inport. Its value can be used as the last argument to
scm_init_from_argv().
Initializes SCM storage and creates a list of the argument strings program-arguments from argv. argc and argv must already be processed to accommodate Scheme Scripts (if desired). The scheme variable *script* is set to the string script_arg, or #f if script_arg is 0. iverbose is the initial prolixity level. If buf0stdin is non-zero, stdin is treated as an unbuffered port.
Call init_signals and restore_signals only if you want SCM
to handle interrupts and signals.
Initializes handlers for SIGINT and SIGALRM if they are
supported by the C implementation. All of the signal handlers
immediately reestablish themselves by a call to signal().
Restores the handlers in effect when init_signals was called.
This is SCM’s top-level. Errors longjmp here. toplvl_fun is a
callback function of zero arguments that is called by
scm_top_level to do useful work – if zero, then repl,
which implements a read-eval-print loop, is called.
If toplvl_fun returns, then scm_top_level will return as
well. If the return value of toplvl_fun is an immediate integer
then it will be used as the return value of scm_top_level. In
the main function supplied with SCM, this return value is the exit
status of the process.
If the first character of string initpath is ‘;’, ‘(’ or whitespace, then scm_ldstr() is called with initpath to initialize SCM; otherwise initpath names a file of Scheme code to be loaded to initialize SCM.
When a Scheme error is signaled, control will pass into
scm_top_level by longjmp, error messages will be printed
to current-error-port, and then toplvl_fun will be called
again. toplvl_fun must maintain enough state to prevent errors
from being resignalled. If toplvl_fun can not recover from an
error situation it may simply return.
Calls all finalization routines registered with add_final(). If freeall is non-zero, then all memory which SCM allocated with malloc() will be freed.
You can call individual Scheme procedures from C code in the
toplvl_fun argument passed to scm_top_level(), or from module
subrs (registered by an init_ function, see Changing Scm).
Use apply to call Scheme procedures from your C code. For
example:
/* If this apply fails, SCM will catch the error */
apply(CDR(intern("srv:startup",sizeof("srv:startup")-1)),
      mksproc(srvproc),
      listofnull);

func = CDR(intern(rpcname,strlen(rpcname)));
retval = apply(func, cons(mksproc(srvproc), args), EOL);
Functions for loading Scheme files and evaluating Scheme code given as C strings are described in the next section, (see Callbacks).
Here is a minimal embedding program libtest.c:
/* gcc -o libtest libtest.c libscm.a -ldl -lm -lc */
#include "scm.h"
/* include patchlvl.h for SCM's INIT_FILE_NAME. */
#include "patchlvl.h"

void libtest_init_user_scm()
{
  fputs("This is libtest_init_user_scm\n", stderr); fflush(stderr);
  sysintern("*the-string*", makfrom0str("hello world\n"));
}

SCM user_main()
{
  static int done = 0;
  if (done++) return MAKINUM(EXIT_FAILURE);
  scm_ldstr("(display *the-string*)");
  return MAKINUM(EXIT_SUCCESS);
}

int main(argc, argv)
     int argc;
     const char **argv;
{
  SCM retval;
  char *implpath, *execpath;

  init_user_scm = libtest_init_user_scm;
  execpath = dld_find_executable(argv[0]);
  fprintf(stderr, "dld_find_executable(%s): %s\n", argv[0], execpath);
  implpath = find_impl_file(execpath, "scm", INIT_FILE_NAME, dirsep);
  fprintf(stderr, "implpath: %s\n", implpath);
  scm_init_from_argv(argc, argv, 0L, 0, 0);

  retval = scm_top_level(implpath, user_main);

  final_scm(!0);
  return (int)INUM(retval);
}

-|
dld_find_executable(./libtest): /home/jaffer/scm/libtest
implpath: /home/jaffer/scm/Init5f4.scm
This is libtest_init_user_scm
hello world
SCM now has routines to make calling back to Scheme procedures easier. The source code for these routines is found in rope.c.
Loads the Scheme source file file. Returns 0 if successful, non-0 if not. This function is used to load SCM’s initialization file Init5f4.scm.
Loads the Scheme source file (in-vicinity (program-vicinity)
file). Returns 0 if successful, non-0 if not.
This function is useful for compiled code init_ functions to load
non-compiled Scheme (source) files. program-vicinity is the
directory from which the calling code was loaded
(see Vicinity in SLIB).
Returns the result of reading an expression from str and evaluating it.
Reads and evaluates all the expressions from str.
If you wish to catch errors during execution of Scheme code, then you can use a wrapper like this for your Scheme procedures:
(define (srv:protect proc)
  (lambda args
    (define result #f)                  ; put default value here
    (call-with-current-continuation
     (lambda (cont)
       (dynamic-wind (lambda () #t)
                     (lambda ()
                       (set! result (apply proc args))
                       (set! cont #f))
                     (lambda ()
                       (if cont (cont #f))))))
    result))
Calls to procedures so wrapped will return even if an error occurs.
These type conversion functions are very useful for connecting SCM and C code. Most are defined in rope.c.
Return an object of type SCM corresponding to the long or
unsigned long argument n. If n cannot be converted,
BOOL_F is returned. Which numbers can be converted depends on
whether SCM was compiled with the BIGDIG or FLOATS flags.
To convert integer numbers of smaller types (short or
char), use the macro MAKINUM(n).
These functions are used to check and convert SCM arguments to
the named C type. The first argument num is checked to see if it
is within the range of the destination type. If so, the converted
number is returned. If not, the ASRTER macro calls wta
with num and strings pos and s_caller. For a listing
of useful predefined pos macros, see C Macros.
Note: Inexact numbers are accepted only by num2dbl,
num2long, and num2ulong (for when SCM is compiled
without bignums). To convert inexact numbers to exact numbers,
see inexact->exact in Revised(5) Scheme.
Returns a pointer (cast to an unsigned long) to the storage
corresponding to the location accessed by
aref(CAR(args),CDR(args)). The string s_name is used in
any messages from error calls by scm_addr.
scm_addr is useful for performing C operations on strings or
other uniform arrays (see Uniform Array).
Returns a pointer (cast to an unsigned long) to the beginning
of storage of array ra. Note that if ra is a
shared-array, the storage accessed this way may be much larger than
ra.
Note: While you use a pointer returned from scm_addr or
scm_base_addr you must keep a pointer to the associated
SCM object in a stack allocated variable or GC-protected
location in order to assure that SCM does not reuse that storage
before you are done with it. See scm_gc_protect.
Return a newly allocated string SCM object copy of the
null-terminated string src or the string src of length
len, respectively.
Returns a newly allocated SCM list of strings corresponding to
the argc length array of null-terminated strings argv. If
argc is less than 0, argv is assumed to be NULL
terminated. makfromstrs is used by scm_init_from_argv
to convert the arguments SCM was called with
to a SCM list which is the value of SCM procedure calls to
program-arguments (see program-arguments).
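The convention of passing a negative count for a NULL terminated array can be sketched with a small helper. toy_argv_length is a hypothetical function written for this illustration; it is not part of rope.c.

```c
#include <assert.h>
#include <stddef.h>

/* Return the number of strings in argv.  A non-negative argc is
   trusted as the length; a negative argc means that argv ends with a
   NULL pointer and must be counted. */
static size_t toy_argv_length(int argc, char *const *argv)
{
    size_t n = 0;
    if (argc >= 0)
        return (size_t)argc;
    assert(argv != NULL);
    while (argv[n] != NULL)
        n++;
    return n;
}
```

A caller holding the argv of main can therefore pass either its argc or -1 and obtain the same list.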
Returns a NULL terminated list of null-terminated strings copied
from the SCM list of strings args. The string s_name
is used in messages from error calls by makargvfrmstrs.
makargvfrmstrs is useful for constructing argument lists suitable
for passing to main functions.
Frees the storage allocated to create argv by a call to
makargvfrmstrs.
The source files continue.h and continue.c are designed to function as an independent resource for programs wishing to use continuations, but without all the rest of the SCM machinery. The concept of continuations is explained in call-with-current-continuation in Revised(5) Scheme.
The C constructs jmp_buf, setjmp, and longjmp
implement escape continuations. On VAX and Cray platforms, the setjmp
provided does not save all the registers. The source files
setjump.mar, setjump.s, and ugsetjump.s provide
implementations which do meet this criterion.
SCM uses the names jump_buf, setjump, and longjump
in lieu of jmp_buf, setjmp, and longjmp to prevent
name and declaration conflicts.
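An escape continuation built from the standard C constructs looks roughly like this. The sketch uses plain jmp_buf, setjmp, and longjmp, not SCM's jump_buf family; toy_continuation, toy_thrown_value, toy_throw, and toy_find_negative are invented names. The thrown value is kept in a static variable so that it is well defined after the jump.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical escape-only continuation: just the saved jump state. */
typedef struct {
    jmp_buf jmpbuf;
} toy_continuation;

/* Value handed to the continuation; static so that it survives the
   longjmp with a well defined value. */
static long toy_thrown_value;

/* Record value and return from cont's setjmp.  Never returns. */
static void toy_throw(toy_continuation *cont, long value)
{
    assert(cont != NULL);
    toy_thrown_value = value;
    longjmp(cont->jmpbuf, 1);
}

/* Return the first negative element of v, escaping out of the loop
   through the continuation; return 0 if there is none. */
static long toy_find_negative(const long *v, size_t n)
{
    toy_continuation cont;
    if (setjmp(cont.jmpbuf) != 0)
        return toy_thrown_value;    /* arrived here via toy_throw */
    for (size_t i = 0; i < n; i++)
        if (v[i] < 0)
            toy_throw(&cont, v[i]);
    return 0;
}
```

Such a continuation can only be used while the function that called setjmp is still active, which is what distinguishes it from the stack-copying continuations described in this section.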
CONTINUATION is a typedefed structure holding all the information
needed to represent a continuation. The other slot can be used to hold
any data the user wishes to put there by defining the macro
CONTINUATION_OTHER.
If SHORT_ALIGN is #defined (in scmfig.h), then
it is assumed that pointers in the stack can be aligned on short
int boundaries.
is a pointer to objects of the size specified by SHORT_ALIGN
being #defined or not.
If CHEAP_CONTINUATIONS is #defined (in scmfig.h)
each CONTINUATION has size sizeof CONTINUATION.
Otherwise, all but root CONTINUATIONs have additional
storage (immediately following) to contain a copy of part of the stack.
Note: On systems with nonlinear stack disciplines (multiple
stacks or non-contiguous stack frames) copying the stack will not work
properly. These systems need to #define CHEAP_CONTINUATIONS in
scmfig.h.
Expresses which way the stack grows by its being #defined or not.
thrown_value gets set to the value passed to throw_to_continuation.
Returns the number of units of size STACKITEM which fit between
start and the current top of stack. No check is done in this
routine to ensure that start is actually in the current stack
segment.
Allocates (malloc) storage for a CONTINUATION of the
current extent of stack. This newly allocated CONTINUATION is
returned if successful, 0 if not. After
make_root_continuation returns, the calling routine still needs
to setjump(new_continuation->jmpbuf) in order to complete
the capture of this continuation.
Allocates storage for the current CONTINUATION, copying (or
encapsulating) the stack state from parent_cont->stkbse to
the current top of stack. The newly allocated CONTINUATION is
returned if successful, 0 if not. After
make_continuation returns, the calling routine still needs to
setjump(new_continuation->jmpbuf) in order to complete the
capture of this continuation.
Frees the storage pointed to by cont. Remember to free storage pointed
to by cont->other.
Sets thrown_value to value and returns from the continuation cont.
If CHEAP_CONTINUATIONS is #defined, then throw_to_continuation does
longjump(cont->jmpbuf, val).
If CHEAP_CONTINUATIONS is not #defined, the CONTINUATION cont contains
a copy of a portion of the C stack (whose bound must be
CONT(root_cont)->stkbse). Then:
longjump(cont->jmpbuf, val);
SCM uses its type representations to speed evaluation. All of the subr
types (see Subr Cells) are tc7 types. Since the tc7 field occupies the
low order bits of the CAR, it can be retrieved and dispatched on
quickly by dereferencing the SCM pointer pointing to it and masking the
result.
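The dereference-and-mask step can be sketched as follows. The cell layout, the DEMO_ type codes, and the function names are hypothetical stand-ins chosen for illustration. They are not the definitions from scm.h. The only detail taken from the text is that the type code sits in the low 7 bits of the CAR.

```c
#include <stdint.h>

/* Hypothetical two-word cell; SCM's real cell is defined in scm.h. */
typedef struct demo_cell {
    uintptr_t car;
    uintptr_t cdr;
} demo_cell;

enum { DEMO_TC7_MASK = 0x7f };  /* low order 7 bits of the CAR */

/* Made-up type codes, used only for this sketch. */
enum {
    DEMO_TC7_SUBR_1  = 0x15,
    DEMO_TC7_SUBR_2  = 0x1d,
    DEMO_TC7_CLOSURE = 0x3d
};

enum demo_kind { DEMO_OTHER, DEMO_SUBR, DEMO_CLOSURE };

/* Retrieves the 7 bit type code: one load and one mask. */
unsigned demo_tc7(const demo_cell *c)
{
    return (unsigned)(c->car & DEMO_TC7_MASK);
}

/* Dispatches on the type code with a single switch. */
enum demo_kind demo_dispatch(const demo_cell *c)
{
    switch (demo_tc7(c)) {
    case DEMO_TC7_SUBR_1:
    case DEMO_TC7_SUBR_2:
        return DEMO_SUBR;
    case DEMO_TC7_CLOSURE:
        return DEMO_CLOSURE;
    default:
        return DEMO_OTHER;
    }
}
```

Bits above the low 7 are free to carry other data, since the mask discards them before the switch.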
All the SCM Special Forms get translated to immediate symbols (isym)
the first time they are encountered by the interpreter (ceval). The
representation of these immediate symbols is engineered to occupy the
same bits as tc7. All the isyms occur only in the CAR of lists.
If the CAR of an expression to evaluate is not immediate, then it may
be a symbol. If so, the first time it is encountered it will be
converted to an immediate type ILOC or GLOC (see Immediates). The lower
7 bits of the codes for ILOC and GLOC distinguish them from all the
other types we have discussed.
Once it has determined that the expression to evaluate is not
immediate, ceval need only retrieve and dispatch on the low order 7
bits of the CAR of that cell, regardless of whether that cell is a
closure, header, or subr, or a cons containing ILOC or GLOC.
In order to be able to convert a SCM symbol pointer to an immediate
ILOC or GLOC, the evaluator must be holding the pointer to the list in
which that symbol pointer occurs. Turning this requirement to an
advantage, ceval does not recursively call itself to evaluate symbols
in lists; it instead calls the macro EVALCAR. EVALCAR does symbol
lookup and memoization for symbols, retrieval of values for ILOCs and
GLOCs, returns other immediates, and otherwise recursively calls itself
with the CAR of the list.
ceval inlines evaluation (using EVALCAR) of almost all procedure call
arguments. When ceval needs to evaluate a list of length greater than
3, the procedure eval_args is called. So ceval can be said to have one
level lookahead. The avoidance of recursive invocations of ceval for
the most common cases (special forms and procedure calls) results in
faster execution. The speed of the interpreter is currently limited on
most machines by interpreter size, probably having to do with its cache
footprint. In order to keep the size down, certain EVALCAR calls which
don’t need to be fast (because they rarely occur or because they are
part of expensive operations) are instead calls to the C function
evalcar.
Top level symbol values are stored in the symhash table. symhash is an
array of lists of ISYMs and pairs of symbols and values.
Whenever a symbol’s value is found in the local environment the pointer
to the symbol in the code is replaced with an immediate object (ILOC)
which specifies how many environment frames down and how far in to go
for the value. When this immediate object is subsequently encountered,
the value can be retrieved quickly.
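The idea of "frames down, then distance in" can be sketched with a toy environment. The frame structure, the 12 bit field split, and the demo_ function names are assumptions made for this illustration. SCM's actual ILOC encoding and environment representation are defined in its own sources and differ from this.

```c
#include <stddef.h>

/* Hypothetical environment frame: an array of values plus a parent link. */
typedef struct demo_frame {
    const long *values;
    size_t count;
    const struct demo_frame *parent;
} demo_frame;

#define DEMO_ILOC_BITS 12                      /* assumed field width */
#define DEMO_ILOC_MAX (1UL << DEMO_ILOC_BITS)  /* 4096 */

/* Packs a frame depth and an in-frame distance into one word. */
unsigned long demo_make_iloc(unsigned long frames_down, unsigned long distance)
{
    return (frames_down << DEMO_ILOC_BITS) | distance;
}

/* Follows an iloc through env. Stores the value in *out and returns 1,
   or returns 0 if the iloc points outside the environment. */
int demo_iloc_lookup(const demo_frame *env, unsigned long iloc, long *out)
{
    unsigned long frames_down = iloc >> DEMO_ILOC_BITS;
    unsigned long distance = iloc & (DEMO_ILOC_MAX - 1);

    while (frames_down > 0) {        /* walk outward, one frame per step */
        if (env == NULL)
            return 0;
        env = env->parent;
        frames_down--;
    }
    if (env == NULL || distance >= env->count)
        return 0;
    *out = env->values[distance];    /* index directly, no name search */
    return 1;
}
```

Once the position is memoized, each later reference costs a few pointer hops and an index instead of a comparison of symbol names.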
ILOCs work up to a maximum depth of 4096 frames or 4096 identifiers in
a frame. Radey Shouman added FARLOC to handle cases exceeding these
limits. A FARLOC consists of a pair whose CAR is the immediate type
IM_FARLOC_CAR or IM_FARLOC_CDR, and whose CDR is a pair of INUMs
specifying the frame and distance with a larger range than ILOCs span.
Adding #define TEST_FARLOC to eval.c causes FARLOCs to be generated for
all local identifiers; this is useful only for testing memoization.
Pointers to symbols not defined in local environments are changed to
one plus the value cell address in symhash. This incremented pointer is
called a GLOC. The low order bit is normally reserved for the GC mark;
but, since references to variables in the code always occur in the CAR
position and the GC mark is in the CDR, there is no conflict.
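The "one plus the value cell address" trick can be sketched as below. The demo_value_cell type and the demo_ functions are hypothetical. The test on the low order bit alone is a simplification for this sketch, since SCM distinguishes GLOCs by more of the low order bits than that.

```c
#include <stdint.h>

/* Hypothetical value cell; it is aligned, so its address has low bit 0. */
typedef struct demo_value_cell {
    long value;
} demo_value_cell;

/* Makes a GLOC-style word: the value cell address plus one. */
uintptr_t demo_make_gloc(demo_value_cell *vcell)
{
    return (uintptr_t)vcell + 1;
}

/* Simplified test: an incremented pointer has its low order bit set. */
int demo_is_gloc(uintptr_t word)
{
    return (word & 1) != 0;
}

/* Undoes the increment and fetches the value with one dereference. */
long demo_gloc_value(uintptr_t gloc)
{
    return ((demo_value_cell *)(gloc - 1))->value;
}
```

Because the tagged word still encodes the value cell address, a global reference needs no hash table lookup after the first evaluation.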
If the compile flag CAUTIOUS is #defined then the number of arguments
is always checked for application of closures. If the compile flag
RECKLESS is #defined then they are not checked. Otherwise, number of
argument checks for closures are made only when the function position
(whose value is the closure) of a combination is not an ILOC or GLOC.
When the function position of a combination is a symbol it will be
checked only the first time it is evaluated because it will then be
replaced with an ILOC or GLOC.
EVAL returns the result of evaluating expression in env. SIDEVAL
evaluates expression in env when the value of the expression is not
used.
Both of these macros alter the list structure of expression as it is
memoized and hence should be used only when it is known that expression
will not be referenced again. The C function eval is safe from this
problem.
Returns the result of evaluating expression in the top-level
environment. eval copies expression so that memoization does not modify
expression.