Re: Sealed as Source Annotation

To: info-dylan@ai.mit.edu
Subject: Re: Sealed as Source Annotation
From: Bruce Hoult <bruce@hoult.org>
Date: Thu, 13 Jul 2000 20:45:02 -0400 (EDT)
Organization: The Internet Group Ltd
References: <slrn8ms4vt.9ml.mkg@asimov.lanl.gov>
User-Agent: MT-NewsWatcher/3.0 (PPC)
Xref: traf.lcs.mit.edu comp.lang.dylan:12458

In article <slrn8ms4vt.9ml.mkg@asimov.lanl.gov>, mkgardne@cs.uiuc.edu 
wrote:

> I would like to start a philosophical discussion of the merits /
> demerits of directly annotating the source code to seal classes (or
> make them final) versus having the annotations be external to the
> source. I am cross-posting to Dylan and Java newsgroups as those two
> languages are the only ones that I am aware of that allow annotations
> which declare that a class will not be extended.

You will know, but I expect most Java programmers won't, that a sealed 
class in Dylan *can* be extended, but only within the same compile unit.

> Question: Why are internal (i.e., direct source) annotations better
> than external annotations?

I think you have to ask yourself: "what is the purpose of having these 
source annotations?"

They might serve merely as a form of documentation of the programmer's 
intent, or they might exist to enable better, faster code to be generated.

In the case of Dylan, a large part of the reason for having "sealing" is 
to enable more static, faster code to be generated.  In order to do 
this, the compiler needs to know about the sealing while it is compiling 
any code that uses that class.  The greatest effect tends to come while 
compiling code that is in the same compile unit as the class itself, 
which means that the sealing has to happen in either the source file 
containing the class, or else perhaps in the source file which contains 
the import/export information for that module.

> My experience with Java has indicated that internal annotations are a
> problem. By declaring a class final so that it cannot be modified, we
> are presupposing that the class is complete and needs no modification
> or extension. Invariably, however, the class is not complete for all
> uses and yet it cannot be extended because it is final. The programmer's
> only recourse is to duplicate the functionality of the finalized class
> and then to extend it. This is clearly wasteful of the programmer's
> time, as well as system resources. (As an example, there have been
> discussions concerning the need to declare Java's String class final
> for efficiency reasons and the attendant limit on extensibility.)

Dylan avoids this problem to a large extent by allowing subclasses to be 
declared in the same compile unit with the sealed class.  So it is not 
that the class has *no* subclasses, but that the list of possible 
subclasses is fixed and known to the compiler (and in many cases is 
empty).
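
For Java readers, a rough analogy (not Dylan's actual mechanism) is a 
class with a private constructor: only classes nested in the same source 
file can extend it, so the set of subclasses is closed and visible in one 
place.  The names Shape, Circle and Square below are made up purely for 
illustration.

```java
// Hypothetical analogy to a Dylan sealed class: the private constructor
// means only classes nested inside Shape can extend it.
abstract class Shape {
    private Shape() {}

    abstract double area();

    static final class Circle extends Shape {
        private final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override double area() { return Math.PI * radius * radius; }
    }

    static final class Square extends Shape {
        private final double side;
        Square(double side) { this.side = side; }
        @Override double area() { return side * side; }
    }

    // Because the list of subclasses is closed, this chain of tests
    // covers every possible Shape.
    static String describe(Shape s) {
        if (s instanceof Circle) return "circle";
        if (s instanceof Square) return "square";
        throw new AssertionError("unreachable: no other subclasses can exist");
    }
}
```

A class declared outside Shape that tries to extend it will not compile, 
because it cannot call the private constructor.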

Thus Dylan programmers only have this problem when they want to extend 
someone *else's* sealed class, and in particular, if they want to extend 
a sealed class declared in the runtime library, such as <integer> or 
<single-float> or <byte-string>.  Of course in Java there is no question 
of being able to extend "int" or "float" because they aren't classes in 
the first place.

There is no real way to get around the fact that you can't extend 
<integer>.  There are just too many places in compiled code where the 
compiler takes advantage of the fact that <integer> values are of fixed 
size, are immutable, and have no subclasses, thus enabling the compiler 
to use a smaller and faster unboxed representation -- in Java terms, it 
is these properties of <integer> that allow the Dylan compiler to 
automatically use "int" instead of "Integer" in many places in the 
compiled code, and automatically and transparently switch between them 
on assignment, passing function arguments, putting them into 
collections, etc.
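
To make the Java comparison concrete, here is a small sketch of the 
boxed/unboxed distinction written out by hand.  The class and method 
names are illustrative only.  In Dylan the programmer just writes 
<integer> and the compiler picks the representation.

```java
// Illustrative only: the two representations a Java programmer
// chooses between explicitly.
final class BoxingSketch {
    // Unboxed: each element is a raw machine integer stored in the array.
    static int sumUnboxed(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Boxed: each element is a reference to an Integer object, which
    // must be unwrapped before any arithmetic can be done on it.
    static int sumBoxed(Integer[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i].intValue();
        }
        return total;
    }
}
```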

What Dylan *does* do is provide unsealed base classes for all the 
sealed classes in the runtime library.  So <integer> is a subclass of 
<number> and <byte-string> is a subclass of <string>.  The user is free 
to create new subclasses of <number> and <string>.  If you write code 
that has variables and function arguments declared as <number> and 
<string> then you can pass objects of user-defined classes into them, 
but your code will be slower than if you use <integer> and <byte-string>.

> It appears to me that declaring the class to be sealed outside the
> source would allow extension where appropriate without "code reuse by
> copying". Additionally, having the annotations external to the source
> naturally leads to allowing specific instances to be sealed while
> leaving others unsealed.

I'm sorry.  I don't understand your intent.  Do you propose that 
particular *instances* of some class might be sealed/final, but that 
other objects of that class will not be?  How would that work?  What 
would it mean?  As I understand it, sealed/final is an attribute of a 
class, not of an object.  [bindings can also be final (called "constant" 
in Dylan), but that's an entirely different thing]

> Finally, the need for annotations should
> decrease as compilers become better at determining for themselves what
> should be sealed and what should not. From my perspective, external
> annotations appear to complement smart compilers better than do
> internal annotations. (By way of analogy, most modern C compilers
> ignore the "register" keyword and determine for themselves which
> variables should be kept in registers unless a compilation flag is
> set.)

I don't think that's a reasonable comparison.  "register" is a very 
local property.  (Can you make a global variable "register"?)

In Dylan, at least, sealing is mostly useful because of libraries and 
separate compilation [1].  If you compile the whole program at once then 
the compiler can see all the classes and sealing declarations are (in 
theory) unnecessary because the compiler can automatically seal 
*everything*.  When you compile a library, however, the compiler has no 
way of knowing whether users of the library might want to extend classes 
exported by the library, so a programmer declaration -- visible to the 
compiler at the time the library is compiled -- is the only choice.

-- Bruce

[1] in Dylan it is also possible to create new classes at runtime, by 
calling make(<class>, ...).  Most programs don't do this, and most 
programs are written such that the compiler can prove that they don't do 
it.
