6.S050 Problem Set 5: Types

\[ \newcommand{\type}[3]{#1 \vdash #2 : #3} \newcommand{\subt}{<:} \]

1. Peer Feedback on Problem Set 4 (20 points)

Respond to the following prompts about your peer's submission and post your response as a comment on Canvas. You can find your response to review on the right column of the Continuation Semantics assignment.

Pick one aspect of their formalism that you think is particularly clear or elegant, and explain why.
Pick one aspect of their formalism that you found confusing or verbose, and suggest an improvement. Feel free to share what you chose to do in your formalism.

2. Types (40 points)

For each of the following terms, annotate it with types (i.e. put a type annotation on the argument of each lambda function) so that it is well-typed in the simply typed, polymorphic, and subtyping lambda calculi. For parametric polymorphism, also insert any necessary type applications. Also write down the type of the whole term.

For example, if you are annotating \(\lambda x. x\), you could write:
- Simple: \(\lambda x:\mathsf{Int}.\ x\) has type \(\mathsf{Int} \rightarrow \mathsf{Int}\)
- Polymorphic: \(\Lambda \alpha.\ \lambda x:\alpha.\ x\) has type \(\forall \alpha.\ \alpha \rightarrow \alpha\)
- Subtype: \(\lambda x:\top.\ x\) has type \(\top \rightarrow \top\)
If the term is not typeable in one of the systems, explain why not.
1. \(\lambda f. \lambda g. f\ (g\ f)\)
2. \(\lambda f. \lambda x. (f\ (\lambda g. g\ x))\ x\)
3. \(\lambda f. \lambda g. \lambda x.\ \{ a = f\ x, b = g\ x \}\)
4. \(\lambda f. f\ \{ x = 1, y = \{ z = 0 \} \} + f\ \{ x = 2, y = \{ z = 1, a = 2 \} \}\)
Note: Although we did not discuss it in class, the notes extend simple and polymorphic types to handle records, so just because a program uses records does not mean that it has no type in these systems. (The type checker in the notes might be helpful for checking your work.)
For each of the following types, give a term that has that type, using the rules for polymorphic typing. If no term has the type, write "none". Some types have more than one term, but you only need to give one.
1. \(\forall \alpha. \forall \beta. (\alpha \rightarrow \alpha \rightarrow \beta) \rightarrow \beta\)
2. \(\forall \alpha. (\forall \beta. \beta \rightarrow \alpha) \rightarrow \alpha\)
3. \(\forall \alpha. \alpha \rightarrow (\forall \beta. \alpha \rightarrow \beta)\)
4. \(\forall \alpha. \alpha \rightarrow (\forall \beta. \beta \rightarrow \alpha)\)
5. \(\forall \alpha. (\forall \beta. \forall \gamma. \beta \rightarrow \gamma) \rightarrow \alpha\)
Assume that \(s \subt u\) and \(u \subt t\). Which of the following are valid under the subtyping relation?
1. \(t \rightarrow u \subt u \rightarrow t\)
2. \(u \rightarrow t \subt t \rightarrow u\)
3. \(\{ x : s, y : t, z : u \} \subt \{ x : t, z : t \}\)
4. \(\{ x : u, y : u \} \rightarrow \{ z : t \} \subt \{ x : s, y : u \} \rightarrow \{ z : u \}\)
Suppose we wanted to add tuples (i.e. product types) to the lambda calculus. We add a type constructor for pairs of types \(\tau\) and \(\tau'\), written \(\tau \times \tau'\). It's reasonable to extend the subtyping relation with the following rule: \[ \frac{\tau_{1} \subt \tau_{1}' \quad \tau_{2} \subt \tau_{2}'}{\tau_{1} \times \tau_{2} \subt \tau_{1}' \times \tau_{2}'} \] Should we add the following rule as well? \[ \tau_1 \times \tau_2 \subt \tau_1 \] What would this rule mean for the runtime semantics of our program?

3. Expressing Program Invariants in Types (40 points)

As language designers, we can use type systems to give programmers tools to check domain-specific invariants.

Here's a simple example. Imagine that we are writing a library for working with SQL queries and we need to build queries that contain user input. We receive a string s from the user, call a function build_sql(s) that produces a SQL query, and run the query. However, if we're not careful about how we build the query, a malicious user might be able to cause us to generate harmful queries.

For example, if build_sql looks like this:

def build_sql(username):
    return f"select * from users where username = '{username}'"

A user might give us the following input: "eviluser'; drop table users;--". When we generate the query, we get:

select * from users where username = 'eviluser'; drop table users;--'

Which destroys our database.

The programmer should have escaped the input string by removing the ' character.

We can ensure that programmers always remember to escape strings by providing the following interface in our language. We add two kinds of string type: the unescaped user input \(\mathsf{DirtyString}\) and the escaped (or at least not read from the user) \(\mathsf{String}\). The run-sql function only accepts \(\mathsf{String}\) input, not \(\mathsf{DirtyString}\) input.

\begin{align*} \text{read-string} &: \mathsf{Int} \rightarrow \mathsf{DirtyString} \\ \text{escape} &: \mathsf{DirtyString} \rightarrow \mathsf{String} \\ \text{run-sql} &: \mathsf{String} \rightarrow \mathsf{Result} \end{align*}

As long as the programmer uses this interface, they cannot accidentally run a SQL query that contains malicious user input. On the other hand, programmers give up the ability to use input that contains single quotes without escaping them.

Propose a type system that rejects all programs that divide by zero. Your type system should accept as many programs as possible.

Your type system should be in the style of the lambda calculus with subtyping.

Here's an example answer: We add two new types \(\mathsf{Zero}\) and \(\mathsf{NonZero}\) and extend the subtyping relation so that \(\mathsf{Zero} \subt \mathsf{Int}\) and \(\mathsf{NonZero} \subt \mathsf{Int}\). We replace the rules for integer constants and the division operator with the following:

\begin{gather} \tag{T-Int-Zero} \type{\Gamma}{0}{\mathsf{Zero}} \\[2ex] \tag{T-Int-NonZero} \frac{n \neq 0}{\type{\Gamma}{n}{\mathsf{NonZero}}} \\[2ex] \tag{T-Div} \frac{\type{\Gamma}{e}{t} \quad \type{\Gamma}{e'}{\mathsf{NonZero}}}{\type{\Gamma}{e / e'}{\mathsf{Int}}} \end{gather}

Our example type system is sound (it rejects every program that divides by zero) but not very permissive. \(1 / 2\) is well typed, but \(1 / (2 + 1)\) is not. \(\lambda x:\mathsf{NonZero}.\ 5 / x\) is well-typed, but \(\lambda x:\mathsf{NonZero}.\ 5 / (x * x)\) is not.

For each of the following programs, answer these questions:

Can the program ever divide by zero? If so, give an example input that results in division by zero.
Can you add type annotations to make the program well-typed in your system? If yes, give the annotated program. Are your type annotations overly restrictive (i.e., are there inputs that your program can accept without dividing by zero but are not allowed by your type annotations)? If the program is not typeable, why not?

Programs:

\(\lambda x.\ \lambda y.\ \lambda z.\ (x + y) / (y + z)\)
\(\lambda x.\ \lambda y.\ \lambda z.\ y / (x * z)\)
\(\lambda x.\ \lambda y.\ 1 / (x + y + 2)\)
\(\lambda x.\ \lambda y.\ 1 / (x * x)\)
\(\lambda x.\ \lambda y.\ \mathsf{if}\ x > 1\ \mathsf{then}\ 5 / (x - 1)\ \mathsf{else}\ y\)
\(\lambda f.\ \lambda x.\ \lambda y.\ 1 / ((1 + (f\ y\ x)) * (1 - (f\ x\ y)))\)

Your type system must accept at least the first three programs.

Give a brief (1 paragraph) justification of your design decisions. Is your type system sound? Does it reject every program that divides by zero? Is your type system general? That is, will it work well for programs that are not listed above?

Hints:

Our example type system has types for zero and non-zero values. Would it be helpful to keep track of whether a value is positive or negative?
Programs 4-6 are trickier to capture than the first three.

1. Peer Feedback on Problem Set 4 (20 points)#

2. Types (40 points)#

3. Expressing Program Invariants in Types (40 points)#

1. Peer Feedback on Problem Set 4 (20 points)

2. Types (40 points)

3. Expressing Program Invariants in Types (40 points)