Estimate of message entropy in compiled form: 19.1 kB.
There are bugs and rough edges in the message. Please be forgiving. It'll all get fixed in an instant of galactic time...
The easily-human-readable source code for the message is given below. For fun, you can also look at a less readable, alien-esque form of the message, updated periodically from the compiled form of the message.
The advantage of using a programming-code-like language is that the reader can play with hypotheticals at any time, and experiment to evaluate alternative statements that are not in the message.
Current status: Playing with integrating the Java virtual machine. Looks doable, and Java bytecode can be generated now from many languages (including Scheme). Also trying to make the core functional notation more readable. It is in transition, and some things are broken.
Functions currently introduced through examples, rather than completely defined in terms of other functions:
The generated message currently consists of a sequence drawn from an alphabet of 4 symbols.
number  symbol  meaning
0       .       binary digit zero
1       :       binary digit one
2       (       marks beginning of an expression
3       )       marks end of an expression
And two pseudo-symbols coded using the above:
sequence  symbol  meaning
()        /       opens an implicit paren, which will close at next paren
                  (A B / C / D) is another way to write (A B (C (D)))
                  This greatly simplifies complex expressions.
(())      ;       marks end of sentence
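As a rough illustration of the "/" shorthand, here is a small Python sketch that expands it back into explicit parentheses. The function names (expand_slashes, render) are made up for this example, and it assumes that every implicit paren closes at the closing paren of the expression it was opened in, which is how the (A B / C / D) example reads. It is not the translator actually used for the message.

```python
import re


def render(tokens):
    """Join tokens with spaces, keeping parentheses tight against their contents."""
    text = ""
    for tok in tokens:
        if text and tok != ")" and not text.endswith("("):
            text += " "
        text += tok
    return text


def expand_slashes(source):
    """Rewrite every '/' as an explicit '(' that closes at the enclosing ')'."""
    tokens = re.findall(r"[()/]|[^\s()/]+", source)
    out = []
    pending = [0]  # implicit parens still open, one counter per nesting level
    for tok in tokens:
        if tok == "(":
            out.append("(")
            pending.append(0)
        elif tok == "/":
            out.append("(")
            pending[-1] += 1
        elif tok == ")":
            if len(pending) == 1:
                raise ValueError("unbalanced closing paren")
            out.extend(")" * pending.pop())  # close the implicit parens first
            out.append(")")
        else:
            out.append(tok)
    if len(pending) != 1:
        raise ValueError("unbalanced opening paren")
    out.extend(")" * pending.pop())  # implicit parens opened at top level
    return render(out)


print(expand_slashes("(A B / C / D)"))
```

The example from the table, (A B / C / D), comes back as (A B (C (D))).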
Numbers are encoded as binary digits between parentheses, e.g. (:::.) is 1110 base 2 which is 14 in decimal. A set of numbers between parentheses constitutes an expression. Expressions can be nested. Expressions followed by a semicolon should evaluate to be true, once the rules for evaluation have been introduced.
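The number encoding can be sketched in a few lines of Python. The names encode_number and decode_number are invented for this illustration, and the handling of zero as "(.)" is an assumption, since the text only gives the example for 14.

```python
def encode_number(n):
    """Write a non-negative integer as binary digits between parentheses."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    digits = format(n, "b").replace("0", ".").replace("1", ":")
    return "(" + digits + ")"


def decode_number(text):
    """Read a number such as '(:::.)' back into an integer."""
    if len(text) < 3 or text[0] != "(" or text[-1] != ")":
        raise ValueError("expected binary digits between parentheses")
    body = text[1:-1]
    if any(ch not in ".:" for ch in body):
        raise ValueError("only '.' and ':' may appear between the parentheses")
    return int(body.replace(".", "0").replace(":", "1"), 2)


print(encode_number(14))        # (:::.)
print(decode_number("(:::.)"))  # 14
```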
In the human-readable form of the message, decimal numbers can be used. They are converted to the above form. Identifiers can also be used. Identifiers are mapped onto arbitrarily assigned numbers. In the message, there is nothing to distinguish identifiers from numbers. The actual language is carefully constructed so that this distinction is never necessary.
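To make the conversion concrete, here is a hedged Python sketch of how a human-readable expression might be turned into the four-symbol form. SymbolTable, to_message, the identifier "plus", and the starting number for identifiers are all hypothetical. The real compiler may assign numbers differently, and "/" and ";" are not handled here.

```python
import re


class SymbolTable:
    """Hands out arbitrary numbers to identifiers, in order of first use."""

    def __init__(self, first_free=100):
        self.ids = {}
        self.next_id = first_free  # arbitrary starting point (an assumption)

    def number_for(self, name):
        if name not in self.ids:
            self.ids[name] = self.next_id
            self.next_id += 1
        return self.ids[name]


def to_message(source, table):
    """Convert a readable expression into '.', ':', '(' and ')' symbols."""
    out = []
    for tok in re.findall(r"[()]|[^\s()]+", source):
        if tok in ("(", ")"):
            out.append(tok)
            continue
        # Decimal numbers are used directly; identifiers are looked up.
        n = int(tok) if tok.isdigit() else table.number_for(tok)
        out.append("(" + format(n, "b").replace("0", ".").replace("1", ":") + ")")
    return "".join(out)


table = SymbolTable(first_free=5)
print(to_message("(plus 1 2)", table))
```

Once converted, the identifier and the decimal numbers look alike in the output, which is the point made above: nothing in the message itself distinguishes them.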
The first number in an expression is treated as an index into an environment that returns a function. When the lambda notation is introduced, it works by modifying that (nested) environment. Expressions are evaluated from the outermost inwards, from left to right, and the "if" form is introduced as lazy.
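The evaluation rule can be illustrated with a toy evaluator in Python. This is only a sketch under several assumptions: nested lists stand in for parenthesized expressions, the indices ADD, LESS_THAN and IF are arbitrary numbers chosen for the example, bare integers in argument position are treated as values, and the lambda notation (which would extend the environment) is not modelled.

```python
ADD, LESS_THAN, IF = 1, 2, 3  # hypothetical function indices


def evaluate(expr, env):
    """Evaluate a nested-list expression against an environment of functions."""
    if isinstance(expr, int):
        return expr
    head, *args = expr
    if head == IF:
        # Lazy: only the branch selected by the condition is evaluated.
        condition, then_branch, else_branch = args
        chosen = then_branch if evaluate(condition, env) else else_branch
        return evaluate(chosen, env)
    function = env[head]  # the first number indexes into the environment
    return function(*(evaluate(arg, env) for arg in args))  # left to right


env = {
    ADD: lambda a, b: a + b,
    LESS_THAN: lambda a, b: a < b,
}

print(evaluate([ADD, 1, [ADD, 2, 3]], env))               # 6
print(evaluate([IF, [LESS_THAN, 1, 2], 10, [99, 0]], env))  # 10
```

In the second call the untaken branch refers to an index (99) that is not in the environment. Because "if" is lazy, that branch is never evaluated and no error occurs.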
This "functional style" of expression is not always particularly easy to follow, even for a human, but it is certainly very expressive. Currently functional definitions are given alongside numerous examples that are in many cases sufficient by themselves to communicate the definition, at least for working purposes. It is probably important to maintain this duality and perhaps extend it with other forms of expression. There are so many models of computation, why not use all of them? Perhaps one will be easier for the reader to follow than the others.
While it is tempting to try to make the message airtight from a formal point of view, defining everything in terms of axioms, this is just one didactic approach - and may be counter-productive or impossible for a large-scale message that includes AI-complete concepts.
Currently there is a conflict between using definitions of functions that are easy to communicate, and definitions that are efficient (or external). This will require some more thought. For example, it would be nice to introduce "if" in its pure lambda calculus form, but to do so would slow checking down right now. The "if" function is instead built in, and (to add insult to injury) introduced with lazy evaluation. It would be more consistent to keep everything eager to begin with, and then show the evaluator being rewritten to facilitate laziness -- easy using the standard trick of wrapping conditional expressions in single-argument functions.
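For what it is worth, the two ideas in this paragraph can be sketched together in Python, which is itself an eager language. The names TRUE, FALSE, church_if, to_church and safe_div are invented for the illustration. The booleans follow the usual lambda calculus encoding, and each branch is wrapped in a single-argument function so that only the selected branch ever runs. This is not the definition used in the message.

```python
# Booleans in lambda calculus style: each one selects one of two arguments.
TRUE = lambda then_value: lambda else_value: then_value
FALSE = lambda then_value: lambda else_value: else_value


def church_if(condition, then_thunk, else_thunk):
    """Eager 'if': pick one wrapped branch, then call it with a dummy argument."""
    return condition(then_thunk)(else_thunk)(None)


def to_church(flag):
    """Bridge from a Python bool to the encoding above (for the demo only)."""
    return TRUE if flag else FALSE


def safe_div(a, b):
    # Without the wrappers, a // b would be computed even when b is zero.
    return church_if(to_church(b == 0), lambda _: 0, lambda _: a // b)


print(safe_div(7, 2))  # 3
print(safe_div(7, 0))  # 0
```

Both branches are passed as ordinary arguments, so eager evaluation is harmless: evaluating a wrapper only builds a function, and the division happens only if that function is called.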
As a completely separate thread, I have started to integrate a presentation of unless gates.