Interfaces, abstraction, and information hiding

\[ \newcommand{\target}[2]{\class{mathjax-symbol mathjax-symbol-#1}{#2}} \newcommand{\type}[3]{#1 \target{type}{\vdash} #2 \target{type}{:} #3} \newcommand{\abs}[3]{\target{abs}{\lambda} #1 \target{abs}{:} #2\target{abs}{.} #3} \newcommand{\tabs}[2]{\target{tabs}{\Lambda} #1 \target{tabs}{.} #2} \newcommand{\arrow}{\target{arrowT}{\rightarrow}} \newcommand{\subt}{\target{subT}{<:}} \newcommand{\Int}{\mathsf{Int}} \newcommand{\Bool}{\mathsf{Bool}} \newcommand{\lett}[3]{\mathsf{let}\ \mathit{#1} : #2 = #3\ \mathsf{in}} \newcommand{\letu}[2]{\mathsf{let}\ \mathit{#1} = #2\ \mathsf{in}\ } \newcommand{\ite}[3]{\mathsf{if}\ #1\ \mathsf{then}\ #2\ \mathsf{else}\ #3} \DeclareMathOperator{\proj}{.} \newcommand{\object}[1]{\mathsf{object}\ #1\ \mathsf{end}} \newcommand{\call}[2]{#1 \# \mathit{#2}} \newcommand{\method}[2]{\mathsf{method}\ \mathit{#1} = #2} \newcommand{\self}{\mathit{self}} \newcommand{\super}{\mathit{super}} \newcommand{\refc}{\mathsf{ref}\ } \newcommand{\inherit}[2]{\mathsf{inherit}\ #1\ \mathsf{as}\ \mathit{#2}} \newcommand{\pack}[3]{\mathsf{pack}\ (#1, #2)\ \mathsf{as}\ #3} \newcommand{\unpack}[4]{\mathsf{unpack}\ (#1, #2) = #3\ \mathsf{in}\ #4} \newcommand{\override}{\ \mathsf{override}\ } \newcommand{\loc}[1]{l_{#1}} \]

Typing relation. Usually written \(\type{\Gamma}{e}{\tau}\), where \(\Gamma\) is a map from identifiers to types, \(e\) is an expression, and \(\tau\) is a type.

\(\type{\mathrm{Type Context}}{\mathrm{Expression}}{\mathrm{Type}}\)

Anonymous function.

\(\abs{\mathrm{Parameter Name}}{\mathrm{Parameter Type}}{\mathrm{Body}}\)

Type abstraction.

\(\tabs{\mathrm{Type Variable}}{\mathrm{Body}}\)

Function type.

\(\mathrm{Parameter Type} \arrow \mathrm{Return Type}\)

Subtyping relation.

Read \(\tau \subt \tau'\) as: \(\tau\) is a subtype of \(\tau'\). If \(\tau\) and \(\tau'\) are records, \(\tau\) has a superset of the fields in \(\tau'\). Alternatively, anything that expects a \(\tau'\) can take a \(\tau\).

\[ \newcommand{\zero}{\mathrm{zero}} \newcommand{\incr}{\mathrm{incr}} \]

Existential Types

  • we've discussed universally quantified types as a way to state that a function can accept parameters of any type—its behavior does not depend on the concrete types that it is passed
  • we might wonder whether existential types have their own uses?
  • intuition
    • \(\forall \alpha. \tau\) is a mapping from types \(\sigma\) to values having type \(\tau[\alpha \mapsto \upsilon]\)
    • \(\exists \alpha. \tau\) says that there exists some \(\sigma\) such that the term has type \(\tau[\alpha \mapsto \sigma]\)
      • operationally, we can treat a value of type \(\exists \alpha. \tau\) as a pair \((e, \sigma)\) of a type \(\sigma\) and a term \(e : \tau[\alpha \mapsto \sigma]\)
  • a term with an existential type can be thought of as a simple module or package that has a single hidden type
    • for example, we could write a package
      • \((\Int, \{ \zero = 0, \incr = \lambda x:\Int. x + 1 \})\)
      • having type \(\exists \alpha. \{ \zero : \alpha, \incr : \alpha \rightarrow \alpha \}\)
    • of course, we can give this module multiple other types
      • \(\exists \alpha. \{ \zero : \Int, \incr : \alpha \rightarrow \alpha \}\)
      • \(\exists \alpha. \{ \zero : \Int, \incr : \Int \rightarrow \alpha \}\)
      • etc
    • we get to choose how much information about the module we keep hidden
      • we choose the level of abstraction in the interface
      • also means that we need to annotate this type, because we can't reasonably infer it
    • therefore, we need to make this type annotation part of the language syntax:
      • \(\pack{\underbrace{\Int}_{\text{hidden type}}}{\underbrace{\{ \zero = 0, \incr = \lambda x:\Int. x + 1 \}}_{\text{package implementation}}}{\underbrace{\exists \alpha. \{ \zero : \alpha, \incr : \alpha \rightarrow \alpha \}}_{\text{package interface}}}\)
        • (this is the introduction form for existential types—it's the only way to make a value having an existential type)

Existential introduction comes with a new typing rule:

\begin{gather} \tag{T-Pack} \frac{ \type{\Gamma}{e}{\tau_{\mathrm{intf}}[\alpha \mapsto \tau_{\mathrm{hidden}}]} }{ \type{\Gamma}{\pack{\tau_{\mathrm{hidden}}}{e}{\exists \alpha. \tau_{\mathrm{intf}}}}{\exists \alpha. \tau_{\mathrm{intf}}} } \end{gather}

This typing rule says that a package has an existential type if its contents are a well typed instantiation of that existential.

  • how do we use these packages?
  • we need a way to access the inside of a package—an elimination form
    • we'll add additional syntax for unpacking a package
    • \(\unpack{\underbrace{\alpha}_{\text{binding of hidden type}}}{\underbrace{x}_{\text{binding of implementation}}}{\underbrace{e}_{\text{package expr.}}}{\underbrace{e'}_{\text{expr. using package}}}\)
  • we can use the unpacking syntax to write functions that depend on modules, while maintaining the abstract nature of the hidden type

Existential elimination also has a new typing rule:

\begin{gather} \tag{T-Unpack} \frac{ \type{\Gamma}{e}{\exists \alpha. \tau_{\mathrm{intf}}} \quad \type{\Gamma, \alpha, x:\tau_{\mathrm{intf}}}{e'}{\tau'} }{ \type{\Gamma}{\unpack{\alpha}{x}{e}{e'}}{\tau'} } \end{gather}

This rule says that if we unpack a package then we can use its interface type to check the expression \(e'\). Note that the hidden type is never exposed to \(e'\). Instead, we extend the typing context \(\Gamma\) with the type variable \(\alpha\) and the interface type \(\tau_{\mathrm{intf}}\), which may refer to \(\alpha\). \(e'\) must type check without being able to see through the abstraction boundary.

The typing rules for existentials may remind you of the rules for universals:

\begin{gather} \tag{T-TypeAbs} \frac{\type{\Gamma, \alpha}{e}{\tau}}{\type{\Gamma}{\tabs{\alpha}{e}}{\forall \alpha.\ \tau}}\\[2ex] \tag{T-TypeApp} \frac{\type{\Gamma}{e}{\forall \alpha.\ \tau}}{\type{\Gamma}{e[\tau']}{\tau[\alpha \mapsto \tau']}} \\[2ex] \end{gather}

The rules have an interesting duality: T-TypeAbs extends the typing context in the same way as T-Unpack; T-TypeApp performs substitution, similar to T-Pack.

Existentials have three associated computation rules:

\begin{gather} \tag{E-Pack} \frac{e \Downarrow v}{\pack{\tau_{\mathrm{hidden}}}{e}{\tau} \Downarrow \pack{\tau_{\mathrm{hidden}}}{v}{\tau}} \\[2ex] \tag{E-Unpack} \frac{e \Downarrow v}{\unpack{\alpha}{x}{e}{e'} \Downarrow \unpack{\alpha}{x}{v}{e'}} \\[2ex] \tag{E-Package} \unpack{\alpha}{x}{\pack{\tau_{\mathrm{hidden}}}{v}{\tau}}{e'} \Downarrow (e'[x \mapsto v])[\alpha \mapsto \tau_{\mathrm{hidden}}] \end{gather}

The E-Pack and E-Unpack rules tell us that the insides of existential packages must be evaluated before they can be used.

The E-Package rule says that we can unpack and use a package value by substituting its value part followed by its type part into the consuming expression \(e'\). This rule should remind you of the rules for function application and let bindings. Note that we are actually substituting the "hidden type" into the consuming expression. We rely on the typing rules to ensure that \(e'\) can't actually see \(\tau_{\mathrm{hidden}}\). This is similar to the way that the rules for universal types ensure that universally quantified expressions can't depend on the concrete type that they are instantiated with.

Abstract data types

An abstract data type (ADT) provides consumers with an interface for manipulating some data type, while not exposing the implementation of that type. An ADT includes a private, concrete representation of the data type which the implementation can see but the consumers cannot. Consumers of the ADT are only allowed to see the interface; all operations on the abstract type must be performed by one of the ADT operations.

The existential types we have just discussed give us a straightforward way to implement ADTs in the polymorphic lambda calculus.

Here's an example written in OCaml. (This might look like what we did in the last lecture, but here the type system is doing a lot more work for us.)

open Printf

module type COUNTER = sig
  type t
  val zero : t
  val incr : t -> t
  val get : t -> int
end

module IntCounter : COUNTER = struct
  type t = int
  let zero = 0
  let incr x = x + 1
  let get x = x
end

module NatCounter : COUNTER = struct
  type t = Z | S of t
  let zero = Z
  let incr x = S x
  let rec get = function
    | Z -> 0
    | S x -> 1 + get x
end

let do_counting (module C : COUNTER) =
  let c = C.zero in
  let c = C.incr c in
  let c = C.incr c in
  printf "Counter value: %d\n%!" (C.get c)

let () = do_counting (module NatCounter : COUNTER)
let () = do_counting (module IntCounter : COUNTER)
Counter value: 2
Counter value: 2
module type COUNTER =
  sig type t val zero : t val incr : t -> t val get : t -> int end
module IntCounter : COUNTER
module NatCounter : COUNTER

When we run the program, we can see that the two calls to do_counting produce the same answer, despite using different implementations of the counter module. (The interpreter also prints out the types of the modules that we defined.)

COUNTER is a module type that is analogous to the following existential type:

\begin{align*} &\mathsf{Counter} = \exists \alpha. \{ \\ &\quad \zero : \alpha \\ &\quad \incr : \alpha \rightarrow \alpha \\ &\quad \mathrm{get} : \alpha \rightarrow \Int \\ &\} \end{align*}

The module implementations correspond to the packing operation that we defined:

\begin{align*} &\mathrm{IntCounter} = \pack{\Int}{\{ \\ &\quad \zero = 0, \\ &\quad \incr = \lambda x: \Int. x + 1, \\ &\quad \mathrm{get} = \lambda x: \Int. x \\ &\}}{\mathsf{Counter}} \end{align*}

We're hiding the internal type (int in this case).

The do_counting function unpacks the package and uses its contents:

\begin{align*} &\mathrm{docounting} = \\ &\quad \lambda p : \mathsf{Counter}. \\ &\quad\quad \unpack{\alpha}{m}{p}{\\ &\quad\quad \lett{c}{\alpha}{m \proj \zero}\\ &\quad\quad \lett{c}{\alpha}{{m \proj \incr}\ c}\\ &\quad\quad \lett{c}{\alpha}{{m \proj \incr}\ c}\\ &\quad\quad \dots } \end{align*}

We can check that the do_counting function is well-typed:

# ∃ a. {zero: a, incr: a -> a, get: a -> Int} counter_intf = \ ExistsT("a", RecordT({"zero": VarT("a"), "incr": ArrowT(VarT("a"), VarT("a")), "get": ArrowT(VarT("a"), IntT())})) # pack (Int, {zero: Int, incr: Int -> Int, get: Int -> Int}) as counter_intf int_counter = \ Pack(IntT(), Record({"zero": Int(0), "incr": Lam("x", IntT(), Var("x") + Int(1)), "get": Lam("x", IntT(), Var("x"))}), counter_intf) def let(x, t, e1, e2): """Syntactic sugar for let x:t = e1 in e2""" return App(Lam(x, t, e2), e1) # λ c:counter_intf. # let (a, m) = unpack c in # let c = m.zero in # let c = m.incr c in # m.get c do_counting = \ Lam("c", counter_intf, Unpack("m", "a", Var("c"), let("c", VarT("a"), Proj(Var("m"), "zero"), let ("c", VarT("a"), App(Proj(Var("m"), "incr"), Var("c")), App(Proj(Var("m"), "get"), Var("c")))))) exists_type_deriv({}, do_counting)

Help

If we try to access the concrete type of the counter, we get a type error:


let access_internal (module C : COUNTER) =
  let c = C.zero + 1 in
  printf "Counter value: %d\n%!" (C.get c)

let () = access_internal (module IntCounter : COUNTER)
Characters 679-685:
    let c = C.zero + 1 in
            ^^^^^^
Error: This expression has type C.t but an expression was expected of type
         int

Interestingly, we also get a type error if we try to return an instance of the existential type:


let return_internal (module C : COUNTER) =
  C.zero

let _ = return_internal (module IntCounter : COUNTER)
Characters 671-677:
    C.zero
    ^^^^^^
Error: This expression has type C.t but an expression was expected of type 'a
       The type constructor C.t would escape its scope

If we revisit the typing rule for unpacking, we see that while the body of the unpack is typed in a context that contains the type variable \(\alpha\) and the existential interface \(\tau_{intf}\), the overall unpack is not typed in this extended context. All of the code that needs to reference the internals of the package must be inside the unpack (or called from inside the unpack with appropriate type arguments).

Existential Objects

Our first attempt to encode objects captured some of their behavior (self-reference and inheritance) but missed other parts (encapsulation). Existential types give us the tools to draw an abstraction boundary around an object.

Using existential types, we can encode objects as a collection of methods that operate on some hidden "state" type. For example, a point object might look like the following:

\begin{align*} &\mathsf{Point} = \exists \sigma. \{ \mathrm{state} : \sigma, \mathrm{methods} : \{ x : \sigma \rightarrow \Int, y : \sigma \rightarrow \Int, \mathrm{shift} : \sigma \rightarrow \Int \rightarrow \Int \rightarrow \sigma \}\} \\ &\pack{\{ x : \Int, y : \Int \}}{\{ \\ &\quad\mathrm{state} = \{ x = 3, y = 7 \},\\ &\quad\mathrm{methods} = \{\\ &\quad\quad\mathrm{x} = \lambda s:\{ x : \Int, y : \Int \}. s \proj x, \\ &\quad\quad\mathrm{y} = \lambda s:\{ x : \Int, y : \Int \}. s \proj y, \\ &\quad\quad\mathrm{shift} = \lambda s:\{ x : \Int, y : \Int \}. \lambda dx:\Int . \lambda dy:\Int . \{ x = s \proj x + dx, y = s \proj y + dy \} \\ &\quad\}\\ &\}}{\mathsf{Point}} \end{align*}

The state record contains the internal state of the point; the methods record contains observer and producer methods.

As with ADTs, we need to unpack the existential to get to the methods. However, object methods should not return the bare state type—instead methods like shift need to return a new point object. We can write a helper function for shift that calls the method and performs the repacking:

\begin{align*} &\mathrm{callshift} = \lambda p:\mathsf{Point}. \lambda dx:\Int . \lambda dy:\Int. \\ &\quad\unpack{\sigma}{\mathrm{this}}{p}{\\ &\quad\pack{\sigma}{\{ \begin{aligned}[t] &\mathrm{state} = \mathrm{this} \proj \mathrm{methods} \proj \mathrm{shift}\ \mathrm{this} \proj \mathrm{state}\ dx\ dy,\\ & \mathrm{methods} = \mathrm{this} \proj \mathrm{methods} \end{aligned} \}}{\mathsf{Point}}} \end{align*}

The key difference between objects and ADTs is that we unpack ADTs once and then we use the ADT methods to manipulate ADT values—objects are repacked after every method call. ADTs are unpacked early and objects are unpacked late. This difference in use gives objects some additional flexibility: it's straightforward for a program to have many different counter objects that have different internal representations. We can treat objects that implement the same interface in a uniform way. While we can unpack two ADTs that implement the same interface, once unpacked they have incompatible types.

On the other hand it's not a problem to implement binary methods (i.e. methods like add that take two instances of an abstract type) using ADTs, but this form of objects cannot express them.

References

  • These notes are based on Ch. 24 of "Types and Programming Languages".

Last updated: 2023-05-08 Mon 10:54