Scheme for Software Engineering

Date: Fri, 22 Dec 95 12:12 EST
From: jaffer (Aubrey Jaffer)

Date: Thu, 14 Dec 95 12:12:59 -0800
From: ...

From: jaffer (Aubrey Jaffer)
But why do this in C? You will have to write excruciating macros or a parser.

The excruciating macros already exist, in almost every Scheme implementation, including SCM. I'm just suggesting we try to standardize the macros more.

I almost never write C code directly anymore. I only mess with it to fix bugs. As with all the code I produce for jobs, the new code for SCM (the only thing recently has been bignum logical operations) is generated or at least started from the output of my schlep compiler.

I spent some effort to make Schlep produce formatted C code; most programmers I work with are not fluent in Scheme, and they have to be able to understand my code contributions.

It is very easy to generate C code from Scheme and you will face none of the `text processing' limitations of C.

Show me ...

SCHLEP generates .c and .h files. SCHLEPH generates just a .h file. SCHLEPH generates macros for everything in the source .scm file. If a construct can't be reasonably translated, a message is produced instead.

Schlep.scm maps primitive C types from the analogous types in Scheme.

Function names ending in ! are typed as void. Identifiers ending in ? are typed as int (or int functions as determined from context) and "_P" is appended to the C name. The last characters of other identifier names determine their C types in declarations according to the association list TYPTRANS (this is expedient, not clean). The CDR of the association has either a token or list of tokens, the first of which specifies an encapsulation (ARRAY, PTR, or FUNCTION) of the second element. Of course, a polished compiler would come up with a uniform convention for names. The default type is int.
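The naming rules above, together with the identifier translations visible in the generated code later in this message, can be sketched in C. This is a hand-written illustration, not code from schlep.scm: the function name scheme_name_to_c and the enum are hypothetical, the TYPTRANS suffix lookup is left out because its entries are not listed here, and only the `-' and `:' translations are confirmed by the example output (mapping every other punctuation character to `_' is an assumption).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result kinds; schlep.scm's real representation may differ. */
enum ctype_kind { CTYPE_VOID, CTYPE_INT };

/* Translate a Scheme identifier into a C identifier and guess its type,
   following only the rules stated in the text:
     trailing '!'  -> void function, the '!' is dropped
     trailing '?'  -> int, the '?' is replaced by "_P"
     anything else -> int (the default; TYPTRANS is not modelled)
   Letters are lower-cased and non-alphanumeric characters become '_',
   e.g. CDL:SET-INIT-LEVEL! -> cdl_set_init_level.
   OUT must have room for at least strlen(NAME) + 2 bytes. */
enum ctype_kind scheme_name_to_c(const char *name, char *out, size_t outsize)
{
  size_t len = strlen(name);
  enum ctype_kind kind = CTYPE_INT;
  int predicate = 0;
  size_t i, j = 0;

  assert(outsize >= len + 2);
  if (len > 0 && name[len - 1] == '!') {
    kind = CTYPE_VOID;
    len--;
  } else if (len > 0 && name[len - 1] == '?') {
    predicate = 1;
    len--;
  }

  for (i = 0; i < len; i++) {
    unsigned char c = (unsigned char)name[i];
    out[j++] = isalnum(c) ? (char)tolower(c) : '_';
  }
  if (predicate) {
    out[j++] = '_';
    out[j++] = 'P';
  }
  out[j] = '\0';
  return kind;
}
```

Because the mapping depends only on the spelling of the name, any file can be translated without first seeing the others, which is the property argued for in the next paragraph.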

Someone will probably be screaming at this point that a civilized soul would use declarations -- I did add declarations at one point (they are probably still in there), but I had to take great care to have declarations included before they were used and all the dependencies were more trouble than they were worth. With an unambiguous name->type mapping, order of compilation is much less of an aggravation.

It is not quite so easy if you have to deal with low-level stuff. That is, you have to distinguish low-level expressions using unboxed C types from Scheme expressions. The former can be trivially written out into C syntax; the latter require more massaging.

Schlep currently deals only with the low-level stuff. It doesn't even know anything special about lists. Lest that seem limited, let me point out that with this system, I have generated and extensively tested more than 400k of Scheme code in less than a year. Most of this code compiles into C and is currently being shipped in NT and DOS device drivers.

The appeal of working this way is that I can debug all my code in Scheme before generating C modules. Very little glue is needed to make the structure definitions (even of non-schemable fields) compatible between Scheme and C. I started to write compilation for SLIB structure macros to .h, but I don't have a strong enough need to push it up in my queue.

One thing that I have learned is that interfaces between modules should be through function calls, not through #define macros. If that seems inefficient, move the interface plane upward a level.
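As a small illustration of that point, here is a sketch with hypothetical names (struct board, board_init_level and so on are not from the driver code). A module that exposes its state only through functions can change its representation without touching callers, whereas a #define accessor bakes the structure layout into every caller.

```c
#include <assert.h>
#include <stddef.h>

/* --- board.h (hypothetical interface) -------------------------------- */
struct board;                              /* layout hidden from callers */
struct board *board_get(void);
int  board_init_level(const struct board *b);
void board_set_init_level(struct board *b, int level);

/* The macro style the text advises against would look like
     #define BOARD_INIT_LEVEL(b) ((b)->init_level)
   and would force every caller to see, and depend on, the layout. */

/* --- board.c (hypothetical implementation) --------------------------- */
struct board { int init_level; };

/* Storage for one board, handed out by a function so that callers never
   need sizeof(struct board). */
static struct board the_board;

struct board *board_get(void)
{
  return &the_board;
}

int board_init_level(const struct board *b)
{
  assert(b != NULL);
  return b->init_level;
}

void board_set_init_level(struct board *b, int level)
{
  assert(b != NULL);
  b->init_level = level;
}
```

If the call overhead matters, the advice above applies: offer one function that does more work per call instead of exposing the fields.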

On my list of things to do is to write a COMPILESCM which generates wrappers for the generated C functions so that they can be called from SCM. This doesn't take much effort. Here are examples of hand-coded wrappers for DOS functions:


char s_outp[] = "outp";
SCM l_outp(addr, val)
     SCM addr, val;
{
  ASSERT(INUMP(addr),addr,ARG1,s_outp);
  ASSERT(INUMP(val),val,ARG2,s_outp);
  outp((unsigned)INUM(addr), (unsigned char)INUM(val));
  return UNSPECIFIED;
}

char s_inpw[] = "inpw";
SCM l_inpw(addr)
     SCM addr;
{
  ASSERT(INUMP(addr),addr,ARG1,s_inpw);
  return MAKINUM(0xffffL & inpw((unsigned)INUM(addr)));
}

An interesting twist is to write wrappers for SCM functions so that C code can call them. I did this in order to debug the interface between an AMD C program for programming MACH PLDs through a JTAG interface and routines for accessing my hardware. These wrappers are very repetitive; the next time I have to write SCM-from-C wrappers, I will extend schlep.scm to do it.

static char crw_frmt[] = "(config:read-word #x%x #x%x)";
word config_read_word(word bus_dev_func, byte register_number)
{
  char str[256];
  if (256 <= sprintf(str, crw_frmt, bus_dev_func, register_number))
    emu_ovrn(crw_frmt);
  return num2ushort(scm_evstr(str),RET1,crw_frmt);
}

static char wcd_frmt[] = "(config:write-dword! #x%x #x%x #x%x)";
int write_config_dword(word bus_dev_func,
		       byte register_number,
		       dword dword_to_write)
{
  char str[256];
  if (256 <= sprintf(str, wcd_frmt,
		     bus_dev_func, register_number, dword_to_write))
    emu_ovrn(wcd_frmt);
  scm_evstr(str);
  return 0;
}
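
The wrappers above build a Scheme expression as text and hand it to scm_evstr. The formatting step can be sketched on its own, without SCM. This sketch assumes a C99 library, where snprintf never writes past the buffer and returns the length it needed, so the overflow test happens before any damage is done. format_config_read is a hypothetical helper name, not part of SCM or of the code above.

```c
#include <assert.h>
#include <stdio.h>

/* Format a call to the Scheme procedure config:read-word into BUF.
   Returns the length of the text on success, or -1 if it would not fit
   (BUF is then truncated but still NUL-terminated).
   Assumes C99 snprintf semantics; the hand-coded wrappers use sprintf. */
int format_config_read(char *buf, size_t size,
                       unsigned bus_dev_func, unsigned register_number)
{
  int n;

  assert(buf != NULL && size > 0);
  n = snprintf(buf, size, "(config:read-word #x%x #x%x)",
               bus_dev_func, register_number);
  if (n < 0 || (size_t)n >= size)
    return -1;
  return n;
}
```

A wrapper in the style above would call this helper, report an overrun when it returns -1, and otherwise pass the buffer to the evaluator.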

I do agree this approach is appealing, but could you be more specific? Some examples?

Here is some Scheme code:

(define init-level-str-ara
  #("Uninitialized"
    "Bus found"
    "Board found"
    "Board configuration registers found"
    "Board partially mapped to memory"
    "Board state initialized"
    "Board fully mapped to memory"
    "Board UART initialized"
    "Board interrupt initialized"
    "Drivers registered"))

;;; UNINIT-CDL-BOARD-TO-LEVEL assumes that DESIRED-INIT-LEVEL is less
;;; than (CDL:INIT-LEVEL BDI).

(define (uninit-cdl-board-to-level bdi desired-init-level)
  (define current-init-level (cdl:init-level bdi))
  (cond
   ((= current-init-level desired-init-level) current-init-level)
   (else
    (case current-init-level
      ((9) (amcc:disable-incoming-mailbox-interrupt! bdi)
	   (CDL:SET-INIT-LEVEL! bdi 8)
	   (uninit-cdl-board-to-level bdi desired-init-level))
      ((8) (CDL:SET-INIT-LEVEL! bdi 7)
	   (uninit-cdl-board-to-level bdi desired-init-level))
      ((7 6 5 3 1) (CDL:SET-INIT-LEVEL! bdi (+ -1 (cdl:init-level bdi)))
		   (uninit-cdl-board-to-level bdi desired-init-level))
      ((4)
       (do ((max-offset 0)
	    (i 0 (+ 1 i)))
	   ((> i 4)
	    (cond
	     ((eqv? -1 (cdl:free-segment
			bdi
			(quotient (+ max-offset page-size 64 -1) page-size)))
	      (edprintf
	       "uninit-cdl-board-to-level: could not free physical segment.\\n"
	       ))))
	 (cond ((not (eqv? #xffffffff (CDL:OFFSET bdi i)))
		(set! max-offset (max max-offset (CDL:OFFSET bdi i)))))
	 (CDL:SET-OFFSET! bdi i #xffffffff))
       (CDL:SET-INIT-LEVEL! bdi 3)
       (uninit-cdl-board-to-level bdi desired-init-level))
      ((2) (CDL:SET-ID! bdi -1)		;because ID is only 16 bits wide.
	   (CDL:SET-INIT-LEVEL! bdi 1)
	   (uninit-cdl-board-to-level bdi desired-init-level))
      (else
       (edprintf "uninit-cdl-board-to-level: desired-init-level out of range: %d\\n"
		desired-init-level)
       current-init-level)))))

Schlep generates the .h file code:

extern char *init_level_str_ara[];

int uninit_cdl_board_to_level(int bdi,int desired_init_level)	;

Schlep generates the .c file code:

char *init_level_str_ara[] =
{"Uninitialized",
   "Bus found",
   "Board found",
   "Board configuration registers found",
   "Board partially mapped to memory",
   "Board state initialized",
   "Board fully mapped to memory",
   "Board UART initialized",
   "Board interrupt initialized",
   "Drivers registered"};


/* UNINIT-CDL-BOARD-TO-LEVEL assumes that DESIRED-INIT-LEVEL is less*/
/* than (CDL:INIT-LEVEL BDI).*/


int uninit_cdl_board_to_level(bdi, desired_init_level)
     int bdi;
     int desired_init_level;
{
L_uninit_cdl_board_to_level:
  {
    int current_init_level = cdl_init_level(bdi);
    if ((current_init_level)==(desired_init_level))
      return current_init_level;
    else switch (current_init_level) {
    case 9:
      amcc_disable_incoming_mailbox_interrupt(bdi);
      cdl_set_init_level(bdi, 8);
      goto L_uninit_cdl_board_to_level;
    case 8:
      cdl_set_init_level(bdi, 7);
      goto L_uninit_cdl_board_to_level;
    case 7:
    case 6:
    case 5:
    case 3:
    case 1:
      cdl_set_init_level(bdi, -1+(cdl_init_level(bdi)));
      goto L_uninit_cdl_board_to_level;
    case 4:
      {
	int i = 0;
	int max_offset = 0;
	while (!((i)>4)) {
	  if (0xffffffff!=(cdl_offset(bdi, i)))
	    max_offset = max(max_offset, cdl_offset(bdi, i));
	  cdl_set_offset(bdi, i, 0xffffffff);{
	    i = 1+(i);
	  }
	}
	if (-1==(cdl_free_segment(bdi, ((max_offset)+(page_size)+64+-1)/(page_size))))
	  dprintf((diagout, "error: uninit-cdl-board-to-level: could not free physical segment.\n"));
      }
      cdl_set_init_level(bdi, 3);
      goto L_uninit_cdl_board_to_level;
    case 2:
      cdl_set_id(bdi, -1);
      cdl_set_init_level(bdi, 1);
      goto L_uninit_cdl_board_to_level;
    default:
      dprintf((diagout, "error: uninit-cdl-board-to-level: desired-init-level out of range: %d\n", desired_init_level));
      return current_init_level;
    }
  }
}

Notice that tail recursion is handled correctly; named lets also work. Notice that top-level comments are preserved. Do-nothing strings in Scheme clauses are also turned into C comments.
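
The named-let case is not shown above. As a hand-written sketch in the style of the generated code (not actual schlep output, and count_set_bits is a made-up example), a named let whose only self-call is in tail position can be rendered as a label and a goto:

```c
#include <assert.h>

/* Scheme source being translated by hand:
 *
 *   (define (count-set-bits n)
 *     (let loop ((n n) (count 0))
 *       (if (= 0 n)
 *           count
 *           (loop (quotient n 2) (+ count (remainder n 2))))))
 */
int count_set_bits(int n)
{
  int count = 0;

  assert(n >= 0);               /* this sketch assumes non-negative input */
L_loop:
  if (0 == n)
    return count;
  else {
    /* Compute both new values before assigning, because the tail call
       evaluates all of its arguments using the old bindings. */
    int new_n = n / 2;
    int new_count = count + n % 2;
    n = new_n;
    count = new_count;
    goto L_loop;
  }
}
```

The temporaries matter whenever one loop variable's new value depends on another's old value; assigning in place, one variable at a time, would change the meaning.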

To recap:

Copyright 1995 Aubrey Jaffer

I am a guest and not a member of the MIT Computer Science and Artificial Intelligence Laboratory.  My actions and comments do not reflect in any way on MIT.
agj @ alum.mit.edu