[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: choosing a language for your audience or macros vs syntax?

To: address@hidden
Subject: Re: choosing a language for your audience or macros vs syntax?
From: Shriram Krishnamurthi <address@hidden>
Date: Sat, 16 Nov 2002 15:02:53 -0500
In-reply-to: <nm9d6p8xcat.fsf@scrubbing-bubbles.mit.edu>
References: <5.0.2.1.2.20021111213530.0201e220@zima.bbn.com><nm9d6p8xcat.fsf@scrubbing-bubbles.mit.edu>
Sender: address@hidden

Bruce pointed out the need for LAMBDA to make a macro system
effective.  I'd like to point out that tail calls are equally
important.  Indeed, there is one even more obscure primitive:
((LAMBDA.  To wit, here's a note I wrote to one of my students about
this.  It's not written particularly well, but hopefully the example
is illustrative enough.

Shriram

----------------------------------------------------------------------
As you know, in some Scheme systems,

  (let ((v e) ...) b)

turns into

  ((lambda (v ...) b) e ...)

For instance,

  (let ((x 3) (y 2)) (+ x y))
  ==>
  ((lambda (x y)
     (+ x y))
   3 2)

There is, however, a performance difference between the two forms.
You can implement LET by extending the current stack frame with one
more binding (if you build a good compiler, this part is almost a
no-op).  In contrast, the ((LAMBDA ...) ...) code requires you to
create a new closure and then apply it -- both relatively more
expensive operations.  [By the way, Schemers typically refer to
((LAMBDA ...)  ...) in speech as "left-left-lambda".]

Given this expense, you might think it silly for a Scheme system to
implement the LET-to-LAMBDA macro: why take an efficient
source-language instruction and make it less efficient behind the
programmers back?  A fair question.

There is at least one Scheme compiler that rejects this line of
thought.  It blindly replaces all source instances of LET with
instances of ((LAMBDA ...) ...).  It then seeks out instances of
((LAMBDA ...) ...) in the program and turns them back into LETs so
that it can compile them efficiently.

This probably strikes you as doubly stupid behavior.  It's bad enough
that you replaced all LETs with ((LAMBDA ...) ...); then you walk the
code and convert them all back as an optimization!  When can this ever
make sense?  If every ((LAMBDA ...) ...) in the code was generated by
transformation from a source LET, then this is a really dumb strategy.
But how else would a ((LAMBDA ...) ...) arise?  Surely no human would
ever write it: just compare

  ((lambda (x y)
     (+ x y))
   3 2)

with

  (let ([x 3]
        [y 2])
    (+ x y))

and that's obvious.  But note the operative phrase: "no human".  Who
else might write code if not a human?

Why, a program, of course.  And Scheme programs are full of exactly
such program-generating programs, called macros.  The example we saw
of a transformation above is just one of them.  Let's consider a
different macro.  Here is an extremely simplified pattern matcher:
the user writes

  (match
   [pat1 act1]
   [pat2 act2]
   ...)

(I'll leave it to your imagination to fill in an example, but you can
consider, for instance, a COND expression, which has similar form.)

How shall we implement MATCH?  One simple way is through the following
macro transformation:

  (match
   [pat0     act0]
   [pat-rest act-rest]
   ...)
  ==>
  (lambda (v)
    (if (pattern=? v pat0)
      act0
      ((match
        [pat-rest act-rest]
        ...)
       v)))

That is, a pattern matcher consists of one or more pattern-action
pairs.  (We ignore the empty case for simplicity.)  This yields a
function that consumes the actual value (V) to check.  The matcher
compares V against the first pattern (using PATTERN=?).  If the
comparison is successful, it invokes the first action.  Otherwise it
invokes the pattern-matcher of the remaining clauses -- which is a
function of one argument, the value to check -- on V.  Spend a moment
convincing yourself this makes sense.

While this makes sense, it's not very efficient.  It expands into code
that has this form:

  (lambda (v0)
    (if (pattern=? v0 pat0)
        act0
        ((lambda (v1)
           (if (pattern=? v1 pat1)
               act1
               ((lambda (v2)
                  (if (pattern=? v2 pat2)
                      act2
                      ...))
                v1)))
         v0)))

I've used different names for V just to make it easy to keep them all
apart.  In fact, however, all Vs are really just the same value.  But
every time one clause fails to match, the matcher creates another
closure and then applies it.  As a result, even if the programmer
wrote a pattern matching sequence that contained no memory-allocating
code, the code might yet allocate memory!  That would be most
unwelcome behavior.

Fortunately, the situation is actually quite pleasant.  Having written
a very straightforward macro, we generate the code above.  The
compiler, however, notices several instances of the ((LAMBDA ...) ...)
pattern.  It immediately collapses these, producing the code

  (lambda (v0)
    (if (pattern=? v0 pat0)
        act0
        (let ([v1 v0])
          (if (pattern=? v1 pat1)
              act1
              (let ([v2 v1])
                (if (pattern=? v2 pat2)
                    act2
                    ...))))))

In fact, since the compiler can now see that these LETs are now
redundant (all they do is rename a variable -- modulo the actions), it
can remove them, resulting in this code:

  (lambda (v0)
    (if (pattern=? v0 pat0)
        act0
        (if (pattern=? v0 pat1)
            act1
            (if (pattern=? v0 pat2)
                act2
                ...))))

Look carefully at this code.  This is pretty much exactly what you
would have been tempted to write by hand.  In fact, read it and it's
obvious that it implements a simple pattern matcher.  Furthermore, it
has just the right "interface": you typically want a matcher to be a
first-class value that you can apply to the value you want to match
against.  This is exactly the interface the above code provides.  It
did this by recursively generating lots of functions, but a smart
choice of compiler "primitive" -- ((LAMBDA ...) ...), in this case --
means that all the inner applications disappear, leaving just the
outer function to accept the value.  A smarter compiler can even
convert the cascading IFs into a more efficient dispatch (eg, through
a hash table) if that might make sense.

References:
- choosing a language for your audience or macros vs syntax?
  - From: Ken Anderson <kanderson@bbn.com>
- Re: choosing a language for your audience or macros vs syntax?
  - From: Bruce Lewis <brlewis@alum.mit.edu>

Prev by Date: Re: Database languages (was Re: Macros Make Me Mad)
Next by Date: Re: choosing a language for your audience or macros vs syntax?
Previous by thread: Re: choosing a language for your audience or macros vs syntax?
Next by thread: FYI: SISC
Index(es):
- Date
- Thread