[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: More on dispatching by function return type



re:    Dispatching on the type of the continuation changes a procedure
        from a function to a relation with a constraint.

This is one of the major gaps in languages of purely static typing (strong
or
otherwise); thank you for making it so obviously clear in so few words.

[metanote: this is my first posting on this or any other recent thread of
LL1, so there are more than a couple pent-up thought coming out now; I hope
the msg doesn't exceed any formal limits; at least it's only about 1K
words.]

Others earlier have commented on this weakness as stemming from a lack of
declarational forms applying to intermediate computational expressions
(except of course for the THE construct in Common Lisp).   As a result,
sooner or later, the StaticType programmer abandons the effort to supply a
type for such "intermediate" expressions---a type which nevertheless can be
statically determined by the programmer at design time, but cannot be
mechanically inferred by the compiler.

Or, where permitted by the language, one resorts to the addition of a
one-time-use local variable solely as a means to supply the type of an
intermediate expression---in short, one has to  FORTRANise  one's code.  Of
course, I realise that most "cast-free" languages will statically prevent
you from doing this, in which case you might take the following alternative.
Worse yet, one writes multiple copies of the callee---one for each possible
type alternative---and then re-writes the caller into a hand-coded dynamic
dispatch; all this to avoid 'type incorrectness."  BTDT (in Java, alas), how
about you?

All this, in the name of "modular" programming?

But how can the programmer safely supply what the type-constraining compiler
cannot infer?   Well often it is due to dynamic operations in the callee
which are triggered by  1. _pairs_  of  types on arguments, or  2. worse yet
by  coordinations between particular _values_ of arguments, akin to the
point in this sub-thread about keyword arguments to shell commands.  For a
small set of particular values (especially for "keywords"), Common Lisp
provided the MEMBER type operator to capture the notion---in the type
calculus---of a (small) _set_ of distinguished values as being a part of the
types system.  (In practice "small" means "less than about a million" due to
performance limits on even the best implementational techniques.)

In addition to providing for placing types on anonymous locations (the THE),
and specific, individual values or sets of values (the MEMBER), CommonLisp
weakly provides a relational-like capability for inferring the types of a
procedure result from the coordinated set of argument types (I'm sure some
other languages do also.)  Many "industrial strength" CL implementations
made extremely good use of this "weak" provision, especially when reducing
compiled arithmetic code to that equivalent to the best of purely static
typing languages---while still retaining all the benefits of a fully,
strongly, dynamically typed language.  Of course, a programmer may use such
typing techniques solely as an adjunct to "catching errors" earlier; either
rationale is sufficient justification for the extended complexity of the
types calculus.

In these cases, the (CommonLisp) functions in question are still "functions"
and not as you suggested "relations"---albeit they are true generic
functions, which few languages seems to have comprehended.  But until you
mentioned it, I hadn't explicitly thought to refer to the extended types
caclulus over these partially-typed situations  as "a relation" in such a
was as to characterise the nature of fully _generic_ functions.

However, that (compile-time) relational-dispatch is not the end of the
"useful" part here.  We quite frequently have the need to look a relations
between values before adequately inferring the types of a procedural result;
for this, CommonLisp provides the SATISFIES type operator, which ultimately
not only makes a compile-time closure on type impossible, but ultimately
wanders into the general Undecidability problem.  [footnote: One might think
that by allowing the compiler to try out all possible values, or types of
values---actually executing the procedure on all such value combinations
before compiling the optimal version of it---you could get a formal closure
on the SATISFIES operator, but if  the boundedness of space and time doesn't
get you here, the general undecidability of loop termination will.  In fact,
from
personal experience, I saw no other way to optimise certain SQL statements
for
the MS SQLServer except this way, and yes that did provide for unbounded
"fun
and games."]

Some would have us believe there are no such "intermediate
expressions"---that if only we try hard enough, every conceivable,
understandable program can be phrased in such a way that all "Procedure"
return values can be typed specifically enough to be useful to the type
calculus.  (If you do happen to have such a proof ready, I hope you can
spare us the email bandwidth on it and instead pass it along to the
successor to the Journal of Irreproducible Results.[*])  The anecdotal
evidence
seems to show the same negative view by pointing out that general UNION-like
expressions in the types calculus can blur the information content of the
typed result such that neither good code-optimisation nor satisfactory error
checking can take place.

Mentioning SQL above was with malice aforethought.  It's semantics are akin
to what you (Joe) seem to be calling for---an extension, for at least the
purposes of the types calculus---of procedure semantics into a more
relational semantics.  Some years ago, Michael Genesereth at Stanford
University made a similar observation about dealing with large
agglomerations of data, and devised a project called Infomaster in which the
programming language was based around a first-order logic theorem prover
over relational sentences.  It turned out to be far more powerful than SQL,
and was actually gobbled up by CommerceOne as the basis of its eCommerce
database engine in 2000; but as most of you already realise, after 9/11 (of
2001) the overwhelming majority of these dotcom type companies went belly-up
due to a diminished marketplace. [Footnote: CommerceOne returned the
intellectual property rights to Genesereth, and there may be still further
research going on in this area.]


I had intentions of commenting on the sub-thread that GLS (Guy Steele)
commented upon, namely the suitability or otherwise of the conflation
between the data Types calculus and the data object Classes calculus.   But
this msg has gone on too long, so perhaps another day . . .



-- JonL --


[*]: Indeed the death of the Journal of Irreproducible Results was announced
"recently" (within the last year or two I think?) and a named successor was
mentioned, whose name slips my mind right now.



----- Original Message ----- 
From: "Joe Marshall" <jrm@ccs.neu.edu>
To: <ll1-discuss@ai.mit.edu>
Sent: Thursday, December 18, 2003 11:39 AM
Subject: Re: More on dispatching by function return type


> Ken Shan <ken@digitas.harvard.edu> writes:
>
> > Despite all these potential problems with a hypothetical program like
> > Make, I take as a given that we want such a thing.  Make itself is
> > clearly not perfect; what I am taking as given is that we want an
> > extensible program that builds and executes a pipeline of conversions,
> > even for trivial (single-step) pipelines.  Such programs abound in
> > current software practice: mail readers, Web browsers, printing systems,
> > clipboard interfaces, and text editors do it all the time.
> >
> > As soon as you have such a program, you are dispatching on the return
> > type of a function without invoking it.
>
> I guess what I am saying is this:
>
>    Dispatching on the type of the continuation changes a procedure
> from a function to a relation with a constraint.
>
> Programs based on relations and constraints are sometimes better,
> sometimes worse, than programs based on composition of functions.  I
> think my original objection was that I thought it was function, but it
> was really a constraint in functional clothes.
>
> ---------------
>
> To answer other questions:
>
> > Do you mean that dvi2ps would be happy to produce ps if "asked to" on
> > the command line, and also happy to produce pdf if "asked to" on the
> > command line?
>
> Yes.
>
>
> > Then the type of dvi2ps would not be "dvi -> (union ps pdf)" in
> > usual notation; that type means that dvi2ps may produce either ps or
> > pdf, at -its- discretion.
>
> As far as other program are concerned, this really is the case.  The
> program may base its choice solely on that single command line
> argument, but the fact is that sometimes dvi2ps produces a ps file,
> and sometimes a pdf file.
>
> > The type that the just-described dvi2ps would take is a conjunction
> > (intersection), not a disjunction (union): "dvi -> (intersection ps
> > pdf)", or equivalently, "(dvi->ps, dvi->pdf)".
>
> Intersection would be a file that is both a pdf and a ps file.
>
>
> But this is just leading us to a situation where we'll have to design
> a type language expressive enough that one can reason about classes of
> output based on command line switches, and that isn't particularly
> on subject.