2. Lexical conventions

This section gives an informal account of some of the lexical conventions used in writing Scheme programs. For a formal syntax of Scheme, see section 7.1 Formal syntax.

Upper and lower case forms of a letter are never distinguished except within character and string constants. For example, Foo is the same identifier as FOO, and #x1AB is the same number as #X1ab.

2.1 Identifiers

Most identifiers allowed by other programming languages are also acceptable to Scheme. The precise rules for forming identifiers vary among implementations of Scheme, but in all implementations a sequence of letters, digits, and "extended alphabetic characters" that begins with a character that cannot begin a number is an identifier. In addition, + and - (which can begin numbers) are identifiers. Here are some examples of identifiers:

lambda                   q
list->vector             soup
+                        V17a
<=?                      a34kTMNs

Extended alphabetic characters may be used in identifiers exactly as if they were letters. The following are extended alphabetic characters:

+ - . * / < = > ! ? : $ % _ & ~ ^

See section 7.1.1 Lexical structure for a formal syntax of identifiers.
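As a small illustration of the rule above, the following sketch binds a few identifiers that contain extended alphabetic characters. The spellings <=? and V17a come from the sample identifiers; list->pair and all of the values bound here are made up for the example.

```scheme
;; Hypothetical definitions; the names only illustrate identifier syntax.
(define V17a 17)                       ; letters and digits, starting with a letter

(define <=?                            ; built entirely from extended alphabetic characters
  (lambda (a b) (<= a b)))

(define list->pair                     ; - and > occur after the first character
  (lambda (lst) (cons (car lst) (car (cdr lst)))))

(<=? V17a 20)                          ; evaluates to #t
(list->pair (list 1 2))                ; evaluates to the pair (1 . 2)
```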

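Identifiers have several uses within Scheme programs: certain identifiers are reserved for use as syntactic keywords (see below); any identifier that is not a syntactic keyword may be used as a variable; and an identifier that appears as a literal or within a literal denotes a symbol.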

The following identifiers are syntactic keywords, and should not be used as variables:

=>           do            or
and          else          quasiquote
begin        if            quote
case         lambda        set!
cond         let           unquote
define       let*          unquote-splicing
delay        letrec

Some implementations allow all identifiers, including syntactic keywords, to be used as variables. This is a compatible extension to the language, but ambiguities in the language result when the restriction is relaxed, and the ways in which these ambiguities are resolved vary between implementations.

The characters ? and ! have no special properties--they are extended alphabetic characters. By convention, however, most predicate procedures (those that return boolean values) are named by identifiers that end in ?, and most data mutation procedures are named by identifiers that end in !.
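The naming convention can be sketched as follows. The procedures all-zero? and clear-first! are hypothetical helpers written for this example; null?, zero?, and set-car! are the standard procedures they rely on.

```scheme
;; A predicate: it returns a boolean, so its name ends in ?.
(define all-zero?
  (lambda (lst)
    (cond ((null? lst) #t)
          ((zero? (car lst)) (all-zero? (cdr lst)))
          (else #f))))

;; A mutation procedure: it changes its argument in place, so its name ends in !.
(define clear-first!
  (lambda (lst)
    (set-car! lst 0)))

(define counts (list 5 0 0))           ; freshly allocated, so it may be mutated
(all-zero? counts)                     ; evaluates to #f
(clear-first! counts)                  ; counts is now (0 0 0)
(all-zero? counts)                     ; evaluates to #t
```

Nothing in the language enforces these suffixes; they only signal intent to the reader.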

2.2 Whitespace and comments

Whitespace characters are spaces and newlines. (Implementations typically provide additional whitespace characters such as tab or page break.) Whitespace is used for improved readability and as necessary to separate tokens from each other, a token being an indivisible lexical unit such as an identifier or number, but is otherwise insignificant. Whitespace may occur between any two tokens, but not within a token. Whitespace may also occur inside a string, where it is significant.
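A brief sketch of these rules, using made-up variable names a, b, and c: the three definitions differ only in the whitespace between tokens, while whitespace inside a token or inside a string does change the meaning.

```scheme
;; The same expression written with different whitespace between tokens.
(define a (+ 1 2))
(define b (+   1
               2))
(define c (  +
  1 2  ))

(= a b c)                              ; evaluates to #t: all three are 3

;; Removing the space between 1 and 2 produces the single token 12.
(+ 12)                                 ; evaluates to 12, not 3

;; Inside a string, each space is a character of the string.
(string-length "a b")                  ; evaluates to 3
(string-length "a  b")                 ; evaluates to 4
```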

A semicolon (;) indicates the start of a comment. The comment continues to the end of the line on which the semicolon appears. Comments are invisible to Scheme, but the end of the line is visible as whitespace. This prevents a comment from appearing in the middle of an identifier or number.

;;; The FACT procedure computes the factorial
;;; of a non-negative integer.
(define fact
  (lambda (n)
    (if (= n 0)
        1        ;Base case: return 1
        (* n (fact (- n 1))))))

2.3 Other notations

For a description of the notations used for numbers, see section 6.5 Numbers.

. + -
These are used in numbers, and may also occur anywhere in an identifier except as the first character. A delimited plus or minus sign by itself is also an identifier. A delimited period (not occurring within a number or identifier) is used in the notation for pairs (see section 6.3 Pairs and lists), and to indicate a rest-parameter in a formal parameter list (see section 4.1.4 lambda expressions).
( )
Parentheses are used for grouping and to notate lists (see section 6.3 Pairs and lists).
'
The single quote character is used to indicate literal data (see section 4.1.2 Literal expressions).
`
The backquote character is used to indicate almost-constant data (see section 4.2.6 Quasiquotation).
, ,@
The character comma and the sequence comma at-sign are used in conjunction with backquote (see section 4.2.6 Quasiquotation).
"
The double quote character is used to delimit strings (see section 6.7 Strings).
\
Backslash is used in the syntax for character constants and as an escape character within string constants (see section 6.7 Strings).
[ ] { }
Left and right square brackets and curly braces are reserved for possible future extensions to the language.
#
Sharp sign is used for a variety of purposes depending on the character that immediately follows it:
#t #f
These are the boolean constants (see section 6.1 Booleans).
#\
This introduces a character constant (see section 6.6 Characters).
#(
This introduces a vector constant (see section 6.8 Vectors). Vector constants are terminated by ).
#e #i #l #s #b #o #d #x
These are used in the notation for numbers (see section 6.5.3 Number syntax).
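The notations listed above can be seen together in a short sketch. All variable and procedure names here (lit, tail-args, n, qq, flags, ch, vec, hex, bin) are hypothetical. The vector constant is quoted, which is accepted whether or not an implementation requires it.

```scheme
;; Quote: the list is literal data, not a procedure call.
(define lit '(a b c))

;; Dotted notation writes a pair directly; it matches what cons builds.
(equal? '(1 . 2) (cons 1 2))                 ; evaluates to #t

;; A period in a formal parameter list collects the remaining arguments.
(define tail-args (lambda (first . rest) rest))
(tail-args 1 2 3)                            ; evaluates to (2 3)

;; Backquote with comma and comma at-sign.
(define n 2)
(define qq `(1 ,n ,@(list 3 4)))             ; the list (1 2 3 4)

;; Sharp-sign notations.
(define flags (list #t #f))                  ; boolean constants
(define ch #\a)                              ; character constant
(define vec '#(1 2 3))                       ; vector constant
(define hex #x1AB)                           ; 427 written in base 16
(define bin #b101)                           ; 5 written in base 2

;; Backslash escapes a double quote inside a string.
(string-length "say \"hi\"")                 ; evaluates to 8
```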
