
Re: cheerful static typing (was: Any and Every... (was: Eval))




In thinking about this on the way to work I decided that the whole
issue of `static or dynamic type checking' is a red herring.  What we
are striving for is `ease of expression'.  This has two components:
it should be easy to express something meaningful, and it should be
hard to express something meaningless (in particular, it shouldn't be
too easy to write syntactic token strings that only *appear* to be
meaningful).

<begin long rambling exposition>

One way of ensuring that expressions are meaningful is to
automatically validate them in some way.  There are many tools for
this.  For example, the language parser will ensure that token strings
are well-formed.  A type system is another popular mechanism.

To draw on the example of the original post, suppose there were two
functions, F and G, where F `must take an even integer' and G `returns
an integer'.  The question was what to do if the user attempts to
compose F and G, i.e., we need to assign a meaning to the expression
`F(G(x))'.  This seems to me to be a typical scenario in programming.

There are a couple of ways of assigning meaning to `F(G(x))'.  What
does it mean when someone says `F must take an even integer'?  I
assume that it means that the behavior of F when given an odd integer
is `undefined' (which means that anything may happen up to and
including erasing my hard disk and causing monkeys to fly out of my
butt).  So F(G(x)) is `undefined' when G returns anything but an even
integer.

I want to avoid anal primate expulsion, so undefined expressions are
truly a bad thing.  One way to `ensure' that expressions are
completely defined is to simply prohibit ones like `F(G(x))' on the
grounds that the types do not match.
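
A minimal sketch of this first option, in Python with type hints (the
notation is an assumption for illustration; nothing in the thread
prescribes a language).  The names Even, make_even, f and g are
hypothetical stand-ins.  The idea is to give `even integer' its own
type, so that a static checker such as mypy can refuse `f(g(x))'
before anything runs:

```python
from typing import NewType

# Hypothetical type standing for `an integer known to be even'.
Even = NewType("Even", int)

def make_even(n: int) -> Even:
    """The only sanctioned way to obtain an Even: check, then tag."""
    if n % 2 != 0:
        raise ValueError(f"{n} is not even")
    return Even(n)

def g(x: int) -> int:
    # Stand-in for G: returns some integer, not necessarily even.
    return 3 * x + 1

def f(n: Even) -> int:
    # Stand-in for F: only meaningful on even integers.
    return n // 2

# f(g(5))  # a static checker would reject this line: int is not Even
result = f(make_even(g(5)))  # the programmer must state the intent explicitly
```

Note that Python itself does not enforce the annotation at run time;
the rejection comes only from running a separate checker over the
code.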

Another option is to `widen' the domain of F such that it has a
defined behavior for any value whatsoever.  You could, for example,
not use F, but some H(F) where H(F) returns a function F' that is the
same as F on even integers, but invokes a debugger for any other
value (or signals an error, or silently exits, etc. etc.)
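
The widening can be sketched as a higher-order function.  This is only
an illustration in Python under assumed names (h, f, f_prime), and it
picks `signal an error' from the choices listed above:

```python
from typing import Callable

def f(n: int) -> int:
    # Stand-in for F: only meaningful when n is an even integer.
    return n // 2

def h(func: Callable[[int], int]) -> Callable[[object], int]:
    """Return F': same as func on even integers, an error everywhere else."""
    def widened(value: object) -> int:
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if is_int and value % 2 == 0:
            return func(value)
        # Defined behavior for every other value: signal an error.
        raise TypeError(f"expected an even integer, got {value!r}")
    return widened

f_prime = h(f)  # F' has a defined result for any argument whatsoever
```

Here `f_prime(16)' behaves exactly like `f(16)', while `f_prime(7)' or
`f_prime("seven")' raises an error instead of doing something
undefined.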

It seems likely to me that if I were tempted to write `F(G(x))' in the
first place, I had an `x' in mind for which G(x) is even.  However,
unless G is simple enough, it may be difficult or impossible to
determine the entire set of `x' for which that holds.
(It may even be empty; I could be mistaken about the `x' I had in
mind.)

What has this to do with type systems?  Not all that much (which is
why it is a red herring).  The question comes down to what to do about
`F(G(x))'.  (I'll make a generalization here; I hope I don't get
flamed too much.)  The static typing camp essentially says ``the
computer shouldn't try to guess the programmer's intent when he writes
F(G(x)).  Raise an error.''  The dynamic typing camp essentially says
``the programmer `obviously' meant F'(G(x)), so I'll just substitute
F' here.''  

Both camps are in agreement that undefined `expressions' are bad, and
both agree that the sooner an error is detected the better.  (I don't
think anyone would argue that `obviously erroneous code' such as `(car
32)' should simply be compiled as a jump to the error handler and no
warning given at compile time.)

Both agree on this, too:  that it is already hard enough to express
yourself within the language, and anything that can make it easier is
a good thing.  The static typing camp has worked hard to develop tools
(like richer type specification and type inference) so that the
computer will not have to `guess' the programmer's intent; it will be
obvious from the code.  The dynamic typing camp has worked hard to
develop tools (like `soft' typing) to refine the kind of substitutions
that the compiler will be allowed to perform when encountering type
mismatches. 

<end long rambling exposition>

The *real* issue is one of `type decoration', i.e., putting in little
tokens here and there to give hints to the compiler as to what is
going on.  No one wants to do this.  The dynamic typing camp
sacrifices a certain amount of `compile-time' error checking to avoid
having to decorate the code.  The static typing camp uses smart tools
to minimize decoration, and they `grin and bear it' otherwise.  To
caricature both camps, the dynamic typing camp says ``it is so painful
to type `integer' that I am *never ever* willing to do it, but I'll
live with runtime exceptions'' whereas the static typing camp says
``My type system is so smart I almost *never* have to add
declarations, but the thought of the computer `guessing' my intent and
`dwimming' my code is *anathema*.  So I'd rather type the occasional
type declaration to avoid it.''

As I mentioned before, I'm very much in the `dynamic' camp.  When I
have to work with a language which is not only statically typed, but
provides no tools for reducing the amount of type decoration, and
furthermore *still* allows one to write meaningless expressions that
*appear* to be meaningful, I end up feeling encumbered.  So I don't
think `lightweight' and `type declarations' go together very well.
I'm sure some agree and others disagree.

~jrm