Interfaces, abstraction, and information hiding
\[ \newcommand{\target}[2]{\class{mathjax-symbol mathjax-symbol-#1}{#2}} \newcommand{\type}[3]{#1 \target{type}{\vdash} #2 \target{type}{:} #3} \newcommand{\abs}[3]{\target{abs}{\lambda} #1 \target{abs}{:} #2\target{abs}{.} #3} \newcommand{\tabs}[2]{\target{tabs}{\Lambda} #1 \target{tabs}{.} #2} \newcommand{\arrow}{\target{arrowT}{\rightarrow}} \newcommand{\subt}{\target{subT}{<:}} \newcommand{\Int}{\mathsf{Int}} \newcommand{\Bool}{\mathsf{Bool}} \newcommand{\lett}[3]{\mathsf{let}\ \mathit{#1} : #2 = #3\ \mathsf{in}} \newcommand{\letu}[2]{\mathsf{let}\ \mathit{#1} = #2\ \mathsf{in}\ } \newcommand{\ite}[3]{\mathsf{if}\ #1\ \mathsf{then}\ #2\ \mathsf{else}\ #3} \DeclareMathOperator{\proj}{.} \newcommand{\object}[1]{\mathsf{object}\ #1\ \mathsf{end}} \newcommand{\call}[2]{#1 \# \mathit{#2}} \newcommand{\method}[2]{\mathsf{method}\ \mathit{#1} = #2} \newcommand{\self}{\mathit{self}} \newcommand{\super}{\mathit{super}} \newcommand{\refc}{\mathsf{ref}\ } \newcommand{\inherit}[2]{\mathsf{inherit}\ #1\ \mathsf{as}\ \mathit{#2}} \newcommand{\pack}[3]{\mathsf{pack}\ (#1, #2)\ \mathsf{as}\ #3} \newcommand{\unpack}[4]{\mathsf{unpack}\ (#1, #2) = #3\ \mathsf{in}\ #4} \newcommand{\override}{\ \mathsf{override}\ } \newcommand{\loc}[1]{l_{#1}} \]
Typing relation. Usually written \(\type{\Gamma}{e}{\tau}\), where \(\Gamma\) is a map from identifiers to types, \(e\) is an expression, and \(\tau\) is a type.
\(\type{\mathrm{Type Context}}{\mathrm{Expression}}{\mathrm{Type}}\)
Anonymous function.
\(\abs{\mathrm{Parameter Name}}{\mathrm{Parameter Type}}{\mathrm{Body}}\)
Type abstraction.
\(\tabs{\mathrm{Type Variable}}{\mathrm{Body}}\)
Function type.
\(\mathrm{Parameter Type} \arrow \mathrm{Return Type}\)
Subtyping relation.
Read \(\tau \subt \tau'\) as: \(\tau\) is a subtype of \(\tau'\). If \(\tau\) and \(\tau'\) are records, \(\tau\) has a superset of the fields in \(\tau'\). Alternatively, anything that expects a \(\tau'\) can take a \(\tau\).
\[ \newcommand{\zero}{\mathrm{zero}} \newcommand{\incr}{\mathrm{incr}} \]
- in previous lectures we discussed modularity—breaking software into smaller pieces that can be understood in isolation
- we discussed objects as a way to reuse and share implementations, and also as a means to write programs that are parameterized over some interface
- our presentation of objects and modules was untyped
- no guarantee that the program actually respects the interface
- what would it mean for a program to not respect an interface?
- accessing private functions or data from the object/module
- relying on "internal details" such as concrete types
- we'd like to use a type system to ensure that our programs adhere to their interfaces
- we also want to be able to hide implementation details from the consumers of a module
Existential Types
- we've discussed universally quantified types as a way to state that a function can accept parameters of any type—its behavior does not depend on the concrete types that it is passed
- we might wonder whether existential types have their own uses?
- intuition
- \(\forall \alpha. \tau\) is a mapping from types \(\sigma\) to values having type \(\tau[\alpha \mapsto \upsilon]\)
- \(\exists \alpha. \tau\) says that there exists some \(\sigma\) such that the term has type \(\tau[\alpha \mapsto \sigma]\)
- operationally, we can treat a value of type \(\exists \alpha. \tau\) as a pair \((e, \sigma)\) of a type \(\sigma\) and a term \(e : \tau[\alpha \mapsto \sigma]\)
- a term with an existential type can be thought of as a simple module or package that has a single hidden type
- for example, we could write a package
- \((\Int, \{ \zero = 0, \incr = \lambda x:\Int. x + 1 \})\)
- having type \(\exists \alpha. \{ \zero : \alpha, \incr : \alpha \rightarrow \alpha \}\)
- of course, we can give this module multiple other types
- \(\exists \alpha. \{ \zero : \Int, \incr : \alpha \rightarrow \alpha \}\)
- \(\exists \alpha. \{ \zero : \Int, \incr : \Int \rightarrow \alpha \}\)
- etc
- we get to choose how much information about the module we keep hidden
- we choose the level of abstraction in the interface
- also means that we need to annotate this type, because we can't reasonably infer it
- therefore, we need to make this type annotation part of the language syntax:
- \(\pack{\underbrace{\Int}_{\text{hidden type}}}{\underbrace{\{ \zero = 0, \incr = \lambda x:\Int. x + 1 \}}_{\text{package implementation}}}{\underbrace{\exists \alpha. \{ \zero : \alpha, \incr : \alpha \rightarrow \alpha \}}_{\text{package interface}}}\)
- (this is the introduction form for existential types—it's the only way to make a value having an existential type)
- \(\pack{\underbrace{\Int}_{\text{hidden type}}}{\underbrace{\{ \zero = 0, \incr = \lambda x:\Int. x + 1 \}}_{\text{package implementation}}}{\underbrace{\exists \alpha. \{ \zero : \alpha, \incr : \alpha \rightarrow \alpha \}}_{\text{package interface}}}\)
- for example, we could write a package
Existential introduction comes with a new typing rule:
This typing rule says that a package has an existential type if its contents are a well typed instantiation of that existential.
- how do we use these packages?
- we need a way to access the inside of a package—an elimination form
- we'll add additional syntax for unpacking a package
- \(\unpack{\underbrace{\alpha}_{\text{binding of hidden type}}}{\underbrace{x}_{\text{binding of implementation}}}{\underbrace{e}_{\text{package expr.}}}{\underbrace{e'}_{\text{expr. using package}}}\)
- we can use the unpacking syntax to write functions that depend on modules, while maintaining the abstract nature of the hidden type
Existential elimination also has a new typing rule:
This rule says that if we unpack a package then we can use its interface type to check the expression \(e'\). Note that the hidden type is never exposed to \(e'\). Instead, we extend the typing context \(\Gamma\) with the type variable \(\alpha\) and the interface type \(\tau_{\mathrm{intf}}\), which may refer to \(\alpha\). \(e'\) must type check without being able to see through the abstraction boundary.
The typing rules for existentials may remind you of the rules for universals:
The rules have an interesting duality: T-TypeAbs extends the typing context in the same way as T-Unpack; T-TypeApp performs substitution, similar to T-Pack.
Existentials have three associated computation rules:
The E-Pack and E-Unpack rules tell us that the insides of existential packages must be evaluated before they can be used.
The E-Package rule says that we can unpack and use a package value by substituting its value part followed by its type part into the consuming expression \(e'\). This rule should remind you of the rules for function application and let bindings. Note that we are actually substituting the "hidden type" into the consuming expression. We rely on the typing rules to ensure that \(e'\) can't actually see \(\tau_{\mathrm{hidden}}\). This is similar to the way that the rules for universal types ensure that universally quantified expressions can't depend on the concrete type that they are instantiated with.
Abstract data types
An abstract data type (ADT) provides consumers with an interface for manipulating some data type, while not exposing the implementation of that type. An ADT includes a private, concrete representation of the data type which the implementation can see but the consumers cannot. Consumers of the ADT are only allowed to see the interface; all operations on the abstract type must be performed by one of the ADT operations.
The existential types we have just discussed give us a straightforward way to implement ADTs in the polymorphic lambda calculus.
Here's an example written in OCaml. (This might look like what we did in the last lecture, but here the type system is doing a lot more work for us.)
open Printf
module type COUNTER = sig
type t
val zero : t
val incr : t -> t
val get : t -> int
end
module IntCounter : COUNTER = struct
type t = int
let zero = 0
let incr x = x + 1
let get x = x
end
module NatCounter : COUNTER = struct
type t = Z | S of t
let zero = Z
let incr x = S x
let rec get = function
| Z -> 0
| S x -> 1 + get x
end
let do_counting (module C : COUNTER) =
let c = C.zero in
let c = C.incr c in
let c = C.incr c in
printf "Counter value: %d\n%!" (C.get c)
let () = do_counting (module NatCounter : COUNTER)
let () = do_counting (module IntCounter : COUNTER)
Counter value: 2 Counter value: 2 module type COUNTER = sig type t val zero : t val incr : t -> t val get : t -> int end module IntCounter : COUNTER module NatCounter : COUNTER
When we run the program, we can see that the two calls to do_counting
produce the same answer, despite using different implementations of the counter module.
(The interpreter also prints out the types of the modules that we defined.)
COUNTER
is a module type that is analogous to the following existential type:
The module implementations correspond to the packing operation that we defined:
We're hiding the internal type (int
in this case).
The do_counting
function unpacks the package and uses its contents:
We can check that the do_counting
function is well-typed:
If we try to access the concrete type of the counter, we get a type error:
let access_internal (module C : COUNTER) =
let c = C.zero + 1 in
printf "Counter value: %d\n%!" (C.get c)
let () = access_internal (module IntCounter : COUNTER)
Characters 679-685: let c = C.zero + 1 in ^^^^^^ Error: This expression has type C.t but an expression was expected of type int
Interestingly, we also get a type error if we try to return an instance of the existential type:
let return_internal (module C : COUNTER) =
C.zero
let _ = return_internal (module IntCounter : COUNTER)
Characters 671-677: C.zero ^^^^^^ Error: This expression has type C.t but an expression was expected of type 'a The type constructor C.t would escape its scope
If we revisit the typing rule for unpacking, we see that while the body of the unpack is typed in a context that contains the type variable \(\alpha\) and the existential interface \(\tau_{intf}\), the overall unpack is not typed in this extended context. All of the code that needs to reference the internals of the package must be inside the unpack (or called from inside the unpack with appropriate type arguments).
Existential Objects
Our first attempt to encode objects captured some of their behavior (self-reference and inheritance) but missed other parts (encapsulation). Existential types give us the tools to draw an abstraction boundary around an object.
Using existential types, we can encode objects as a collection of methods that operate on some hidden "state" type. For example, a point object might look like the following:
The state record contains the internal state of the point; the methods record contains observer and producer methods.
As with ADTs, we need to unpack the existential to get to the methods.
However, object methods should not return the bare state type—instead methods like shift
need to return a new point object.
We can write a helper function for shift
that calls the method and performs the repacking:
The key difference between objects and ADTs is that we unpack ADTs once and then we use the ADT methods to manipulate ADT values—objects are repacked after every method call. ADTs are unpacked early and objects are unpacked late. This difference in use gives objects some additional flexibility: it's straightforward for a program to have many different counter objects that have different internal representations. We can treat objects that implement the same interface in a uniform way. While we can unpack two ADTs that implement the same interface, once unpacked they have incompatible types.
On the other hand it's not a problem to implement binary methods (i.e. methods like add
that take two instances of an abstract type) using ADTs, but this form of objects cannot express them.
References
- These notes are based on Ch. 24 of "Types and Programming Languages".