Type Checking

-- Why? Rationale

-- What? Form of types

-- How? Mechanism of a type checker

Why do Type Checking?

Even better than catching type errors at runtime (like Scheme does)
is to catch them before the program is ever run. This is called *static
type checking*. It has been done for a long time (eg in Algol and Pascal)
but only recently became popular in industrial settings with the advent
of Java.

The basic principle is that a tool called a type checker is run on the
program before execution. If the type checker finds no errors, the program
is guaranteed to run without any type errors happening during execution.
This is called *soundness*.

Unfortunately, type checker aren't *complete*: they can't show
that every type-safe program will execute OK. So there's a tradeoff. Statically
typed languages must to some extent rule out programs that would otherwise
be legal. Most of these are programs that you wouldn't want to write anyway
(eg, the type error is in dead code). How many reasonable programs are
ruled out depends on how sophisticated the type checker is. Research in
type checking focuses on trying to allow more programs while still preventing
errors. The checker you'll work with in the problem set is very limited
and rules out lots of reasonable programs. Java's type checker rules out
very few.

We discussed a variety of expressions:

-- Expressions that would fail that are caught :-)

(if #t (+ 1 #f) 3)

-- Expressions that would fail that aren't caught :-(

(if #t (/ 1 0) 3)

-- Expressions that wouldn't fail but are stil rejected :-(

(if #t 1 (+ 1 #f))

(if #t 2 #f)

(cons 1 2)

(cons #t nil)

Primitive Types

-- Symbol

-- Number

-- String

-- Boolean

Type constructors

-- Procedure arrow

-- List

-- Pair

(lambda (a) a)

cons

map

Don't confuse with *overloading*! In many languages + can be used
on integers and strings, but these are really different procedures.

In the problem set type checker there's no polymorphism. It's much harder to deal with. So all our lists will be ists of numbers, and cons will have the type:

cons : Number, List<Number> -> List<Number>

rather than

cons : Number, List<A> -> List<A>

Draw a picture illustrating this.

For combinations:

-- find the type of the operator; say it's A -> B

-- find the type of the operand; it'll have to be A

-- the expression as a whole has type B

Example:

-- (+ 1 (+ 2 3))

The whole strategy is recursive

-- to find the type of an expression, find the types of its parts

-- put them together using the typing rules

What about an if-statement?

-- predicate must have type Bool

-- each branch must have some type T

-- then expression has type T

Examples:

-- (if (< 3 4) 'a 'b)

-- (if (< 3 'a) 'a 'b)

-- (if #t 'a 3)

But how do we get going?

-- need to know types of primitives, eg +

-- have a type environment that binds names to types

What about lambda?

-- to make life easier, we'll change the syntax

-- programmer must tell us the types:

(lambda RETURN-TYPE ( PARAM PARAM ... ) BODY)

PARAM has form (NAME TYPE)

for example: (lambda number ((x number) (y number)) (+ x y))

Now the rule is

-- find the type of the body, using the types of the params

-- check that it's the result type

-- type of expression is argtypes -> result type

Examples:

-- (lambda number ((x number)) x)

-- (lambda (listof number) ((x number)) (cons x nil))

-- (lambda (-> (number number) number) () +)

How do we "use the types of the params"?

-- we need a TYPE ENVIRONMENT

-- binds names to types

-- at any point, we have a TE; start of with a global TE

-- to analyze body of lambda, extend TE with bindings of params

Example:

-- (lambda boolean ((x number)) (lambda boolean ((y number)) (<
x y)) 3))

Type checking is a strange form of execution!

-- binds names to types, not values

-- "executes" all paths: eg, both branches of an if-statement

-- always terminates

A puzzle

-- Why does the type checker not need to form double bubbles?

-- how is lexical scoping reflected?

What about defines?

-- must declare type of variable

-- allows recursive definitions

What about abstract types?

-- problem set has clever way to handle

-- involves representing TE with two tables

Daniel Jackson

November 11, 1999