# Control
\[ \newcommand{\target}[2]{\class{mathjax-symbol mathjax-symbol-#1}{#2}} \newcommand{\evalStmt}[2]{\langle #1 \rangle \target{evalStmt}{\Downarrow} #2} \newcommand{\evalStmtB}[2]{\langle #1 \rangle \target{evalStmtB}{\Downarrow} \langle #2 \rangle} \newcommand{\evalArrowK}{\target{evalStmtK}{\rightarrow}} \newcommand{\evalStmtK}[2]{\langle #1 \rangle \evalArrowK \langle #2 \rangle} \newcommand{\evalExpr}[2]{\langle #1 \rangle \target{evalExpr}{\Downarrow_{e}} #2} \newcommand{\while}[2]{\mathtt{while}\ #1\ \mathtt{do}\ #2} \newcommand{\if}[3]{\mathtt{if}\ #1\ \mathtt{then}\ #2\ \mathtt{else}\ #3} \newcommand{\updateStore}[3]{#1\target{updateStore}{[}#2 \target{updateStore}{\mapsto} #3\target{updateStore}{]}} \newcommand{\kseqd}{\target{kseqd}{\cdot}} \newcommand{\kstop}{\target{kstop}{\mathsf{Kstop}}} \newcommand{\kwhile}{\target{kwhile}{\mathsf{Kwhile}}} \]
- \(\kwhile\ (\text{Condition})\ (\text{Loop Body})\ (\text{Continuation})\): continuation that represents a while loop and subsequent code.
- \(\kstop\): special continuation that represents program termination.
- \(\evalStmt{\text{Statement}, \text{Store}}{\text{Store}}\): big-step evaluation relation for statements.
- \(\evalStmtB{\text{Statement}, \text{Store}}{\text{Store}, \text{DidBreak?}}\): big-step evaluation relation for statements with `break`.
- \(\evalStmtK{\text{Continuation}, \text{Store}}{\text{Continuation}, \text{Store}}\): small-step evaluation relation for statements using continuations.
- \(\evalExpr{\text{Expression}, \text{Store}}{\text{Value}}\): big-step evaluation relation for expressions.
- \(\sigma[x \mapsto v]\) creates a new store where \(x\) maps to \(v\).
- \(s \kseqd k\) is shorthand for \(\mathsf{Kseq}\ s\ k\).
- \(s \kseqd k \kseqd k'\) is shorthand for \(\mathsf{Kseq}\ s\ (\mathsf{Kseq}\ k\ k')\).
## Structured Control Flow

### Motivating questions
- How did we end up with the control constructs we use today (i.e. `while`, `for`, `if`, etc.)?
- Why not use more expressive control constructs, like `goto`?
- How should we think about non-local control (i.e. `break`, exceptions)?
### Shift from unstructured to structured control
The control flow operators that we use in imperative programming languages came from languages developed in the late 1950s and early 1960s. At the time, programming was shifting from low-level, machine-oriented assembly languages to high-level languages that could abstract away some of the details of the machine. This shift was driven by a desire for increased programmer productivity, and FORTRAN was the first language to deliver on that promise. Over time, high-level languages displaced assembly.
In 1954, FORTRAN introduced `if` statements and `do` loops (which are essentially equivalent to `for` loops).
ALGOL 60, released in 1960, is often credited with starting the structured programming revolution, because it introduced the notion of a "code block" with its own lexical scope.
ALGOL 60 even supported recursive function calls.
This was a controversial feature at the time because it required the language implementer to add a call stack. Implementers preferred to statically allocate function activation records.
Languages that came after ALGOL and FORTRAN picked up these constructs, and they spread widely.
With these new, more structured control flow features, programmers began to argue for a style of programming that avoids the use of unstructured control flow (i.e. `goto`).
Although `goto` is highly expressive (it can create any control flow graph), it can be difficult to reason about the resulting program.
In particular, determining the invariants that hold at a particular point in the program's text is a global problem in the presence of `goto`.
Structured programming really picked up steam after it was proven that any computable function can be written as a control-flow graph that can be constructed with only `while` and `if`.
Despite the shift to structured control, there are still a number of "unstructured" control constructs in common use:
- Early returns
- Exceptions
- Break & continue
We want to be able to give these constructs reasonable semantics. We'd also like those semantics to be reasonably concise and straightforward to modify and extend.
## Extending IMP with `break`

In this section, we'll extend IMP with a `break` statement.
First, we'll present a semantics that extends the state with a boolean flag that tracks whether a `break` has occurred.
Then, we’ll present a version of the semantics that uses continuations, and show that the continuation version is significantly easier to extend.
### Big-step semantics for IMP

First, we'll give big-step semantics for IMP without `break`.
Evaluation relations:
- \(\evalStmt{\text{Statement}, \text{Store}}{\text{Store}}\)
- \(\evalExpr{\text{Expression}, \text{Store}}{\text{Value}}\)
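The rules themselves were presented as a figure, which is not reproduced here; the following is a reconstruction of the standard big-step rules for IMP in this notation:

\[
\frac{}{\evalStmt{\mathtt{skip}, \sigma}{\sigma}}
\qquad
\frac{\evalExpr{e, \sigma}{v}}{\evalStmt{x := e, \sigma}{\updateStore{\sigma}{x}{v}}}
\qquad
\frac{\evalStmt{s_1, \sigma}{\sigma'} \quad \evalStmt{s_2, \sigma'}{\sigma''}}{\evalStmt{s_1; s_2, \sigma}{\sigma''}}
\]
\[
\frac{\evalExpr{e, \sigma}{\mathsf{true}} \quad \evalStmt{s_1, \sigma}{\sigma'}}{\evalStmt{\if{e}{s_1}{s_2}, \sigma}{\sigma'}}
\qquad
\frac{\evalExpr{e, \sigma}{\mathsf{false}} \quad \evalStmt{s_2, \sigma}{\sigma'}}{\evalStmt{\if{e}{s_1}{s_2}, \sigma}{\sigma'}}
\]
\[
\frac{\evalExpr{e, \sigma}{\mathsf{false}}}{\evalStmt{\while{e}{s}, \sigma}{\sigma}}
\qquad
\frac{\evalExpr{e, \sigma}{\mathsf{true}} \quad \evalStmt{s, \sigma}{\sigma'} \quad \evalStmt{\while{e}{s}, \sigma'}{\sigma''}}{\evalStmt{\while{e}{s}, \sigma}{\sigma''}}
\]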
### Using a flag

We can add `break` to IMP by modifying the evaluation relation for statements as follows:
\[ \evalStmtB{\text{Statement}, \text{Store}}{\text{Store}, \text{DidBreak?}} \]
We now track an additional flag that tells us whether we are currently trying to break out of a loop.
We can update our semantics for IMP to track this flag:
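The updated rules are not reproduced here; one plausible reconstruction of the interesting cases (`break`, sequencing, and `while`) is:

\[
\frac{}{\evalStmtB{\mathtt{break}, \sigma}{\sigma, \mathsf{true}}}
\qquad
\frac{\evalStmtB{s_1, \sigma}{\sigma', \mathsf{true}}}{\evalStmtB{s_1; s_2, \sigma}{\sigma', \mathsf{true}}}
\qquad
\frac{\evalStmtB{s_1, \sigma}{\sigma', \mathsf{false}} \quad \evalStmtB{s_2, \sigma'}{\sigma'', b}}{\evalStmtB{s_1; s_2, \sigma}{\sigma'', b}}
\]
\[
\frac{\evalExpr{e, \sigma}{\mathsf{false}}}{\evalStmtB{\while{e}{s}, \sigma}{\sigma, \mathsf{false}}}
\qquad
\frac{\evalExpr{e, \sigma}{\mathsf{true}} \quad \evalStmtB{s, \sigma}{\sigma', \mathsf{true}}}{\evalStmtB{\while{e}{s}, \sigma}{\sigma', \mathsf{false}}}
\]
\[
\frac{\evalExpr{e, \sigma}{\mathsf{true}} \quad \evalStmtB{s, \sigma}{\sigma', \mathsf{false}} \quad \evalStmtB{\while{e}{s}, \sigma'}{\sigma'', b}}{\evalStmtB{\while{e}{s}, \sigma}{\sigma'', b}}
\]

Note that the loop rules consume the flag (a `while` never propagates \(\mathsf{true}\) upward), which is what makes `break` exit exactly one loop.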
Note that our semantics got quite a bit longer.
We now have three rules for handling `while` and two for sequencing.
In addition, every part of the semantics had to change to handle the break flag.
If we decided that we also wanted to add `continue`, we would have to make another significant modification to the semantics.
### Using continuations

We can solve some of our issues with extensibility by switching to a continuation-based semantics for IMP.
If we keep around an explicit representation of "what to do next" (i.e. a continuation) we can more easily add language constructs that modify that continuation.
We change our evaluation relation to the following: \[\evalStmtK{\text{Continuation}, \text{Store}}{\text{Continuation}, \text{Store}}\] (We also switch to a small-step presentation, because it's a bit clearer.)
Continuations are defined to be either: the special stop continuation `Kstop`, a statement followed by a continuation `Kseq`, or a while loop followed by a continuation `Kwhile`.
Formally, this is written
\[ k : \text{Continuation} ::= \kstop ~|~ \mathsf{Kseq}\ s\ k ~|~ \kwhile\ e\ s\ k. \]
We also define a shorthand for writing \(\mathsf{Kseq}\): \[\mathsf{Kseq}\ s\ k = s \kseqd k.\]
The semantics of IMP using continuations are:
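The figure with the rules is not reproduced here; the following reconstruction matches the Python implementation later in this section rule-for-rule (including the `break` rules):

\[
\frac{}{\evalStmtK{(s_1; s_2) \kseqd k, \sigma}{s_1 \kseqd s_2 \kseqd k, \sigma}}\ \text{(K-Seq)}
\qquad
\frac{\evalExpr{e, \sigma}{v}}{\evalStmtK{x := e \kseqd k, \sigma}{k, \updateStore{\sigma}{x}{v}}}\ \text{(K-Assign)}
\]
\[
\frac{\evalExpr{e, \sigma}{\mathsf{true}}}{\evalStmtK{\if{e}{s_1}{s_2} \kseqd k, \sigma}{s_1 \kseqd k, \sigma}}\ \text{(K-If-true)}
\qquad
\frac{\evalExpr{e, \sigma}{\mathsf{false}}}{\evalStmtK{\if{e}{s_1}{s_2} \kseqd k, \sigma}{s_2 \kseqd k, \sigma}}\ \text{(K-If-false)}
\]
\[
\frac{}{\evalStmtK{\mathtt{skip} \kseqd k, \sigma}{k, \sigma}}\ \text{(K-Skip)}
\qquad
\frac{}{\evalStmtK{\while{e}{s} \kseqd k, \sigma}{\kwhile\ e\ s\ k, \sigma}}\ \text{(K-While-1)}
\]
\[
\frac{\evalExpr{e, \sigma}{\mathsf{true}}}{\evalStmtK{\kwhile\ e\ s\ k, \sigma}{s \kseqd \kwhile\ e\ s\ k, \sigma}}\ \text{(K-While-2)}
\qquad
\frac{\evalExpr{e, \sigma}{\mathsf{false}}}{\evalStmtK{\kwhile\ e\ s\ k, \sigma}{k, \sigma}}\ \text{(K-While-3)}
\]
\[
\frac{}{\evalStmtK{\mathtt{break} \kseqd \kwhile\ e\ s\ k, \sigma}{k, \sigma}}\ \text{(K-Break-1)}
\qquad
\frac{}{\evalStmtK{\mathtt{break} \kseqd s' \kseqd k, \sigma}{\mathtt{break} \kseqd k, \sigma}}\ \text{(K-Break-2)}
\]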
The initial state of a program \(p\) is \(\langle p \kseqd \kstop, \emptyset \rangle\), and final states are of the form \(\langle \kstop, \sigma \rangle\). We define a multistep evaluation relation \(\rightarrow^{*}\) as usual: \[ \frac{ \evalStmtK{s \kseqd \kstop, \sigma}{k, \sigma'} \quad \langle k, \sigma' \rangle \rightarrow^* \langle \kstop, \sigma'' \rangle }{ \langle s \kseqd \kstop, \sigma \rangle \rightarrow^* \langle \kstop, \sigma'' \rangle } \]
### Adding `continue`

Earlier we said that adding a `continue` statement to the semantics of IMP that uses a `break` flag would be difficult.
We can add `continue` to the continuation semantics in a very straightforward way:
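The rules themselves are not reproduced here; a reconstruction, by direct analogy with the `break` rules, is:

\[
\frac{}{\evalStmtK{\mathtt{continue} \kseqd \kwhile\ e\ s\ k, \sigma}{\kwhile\ e\ s\ k, \sigma}}
\qquad
\frac{}{\evalStmtK{\mathtt{continue} \kseqd s' \kseqd k, \sigma}{\mathtt{continue} \kseqd k, \sigma}}
\]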
This rule says that when we see a `continue` statement, we skip executing all the statements up to the "continue executing the loop" continuation, and we proceed to execute another iteration of the loop.
### Implementation

It's straightforward to translate these rules into a Python implementation.
We can use this implementation to explore the machinery of `break` and `continue`.
```python
def stmt_step(config):
    '''
    Take one small step of evaluation using the continuation-based semantics.
    Returns the new configuration and the name of the rule used.
    '''
    (cont, store) = config
    match cont:
        case KSeq(Seq(lhs, rhs), k):  # Sequence rule
            return ((KSeq(lhs, KSeq(rhs, k)), store), 'K-Seq')
        case KSeq(Assign(var, expr), k):  # Assignment rule
            return ((k, bind(store, var, expr_eval((expr, store)))), 'K-Assign')
        case KSeq(If(cond, then, else_), k):  # If rules
            match expr_eval((cond, store)):
                case Bool(True):
                    return ((KSeq(then, k), store), 'K-If-true')
                case Bool(False):
                    return ((KSeq(else_, k), store), 'K-If-false')
        case KSeq(Skip(), k):  # Skip rule
            return ((k, store), 'K-Skip')
        case KSeq(While(cond, body), k):  # While-to-continuation rule
            return ((KWhile(cond, body, k), store), 'K-While-1')
        case KWhile(cond, body, k) if expr_eval((cond, store)) == Bool(True):
            return ((KSeq(body, KWhile(cond, body, k)), store), 'K-While-2')
        case KWhile(cond, body, k) if expr_eval((cond, store)) == Bool(False):
            return ((k, store), 'K-While-3')
        case KSeq(Break(), KWhile(_, _, k)):
            return ((k, store), 'K-Break-1')
        case KSeq(Break(), KSeq(_, k)):
            return ((KSeq(Break(), k), store), 'K-Break-2')

def initial(stmt):
    return (KSeq(stmt, KStop()), {})

def is_final(config):
    return config[0] == KStop()

def stmt_run(stmt):
    config = initial(stmt)
    yield config
    while not is_final(config):
        (config, rule) = stmt_step(config)
        yield (config, rule)
```
Try editing this example program:
We can see that running this program using our continuation semantics causes the loop to exit early, and the program to terminate with `x = 1`.
You can download the full code here.
## Continuations
In the previous section, we saw how keeping track of the "rest of the computation to be performed" or continuation gives us the flexibility to add some new control flow constructs without making major changes to the rest of the semantics. In this section, we'll explore the notion of continuation further.
One way we can look at continuations is to consider how they relate to the small-step evaluation machinery that we have already seen.
At each step of small-step evaluation, we search the term being evaluated for the next redex.
For example, in `4 + [5 * 6]`, the redex is in brackets.
We can split the term into a context `4 + □` and redex `5 * 6`.
We can think of the context `4 + □` as one representation of the computation to be performed after reducing the redex.
We'll also often write these continuations using lambda notation (e.g. \(\lambda x. 4 + x\)).
This is the normal continuation; it continues the normal flow of evaluation.
If we were instead evaluating 4 + (1 / 0)
, we might replace the normal continuation with one that terminates execution and prints out an error.
We'll see later how continuations are helpful in describing the semantics of exceptions.
We can get some more intuition about how continuations work by writing out the normal continuation for each step of evaluation.
We'll evaluate an FL term: `(4 * (5 + 1)) + 3`.
| Term being evaluated | Normal continuation |
|---|---|
| `(4 * (5 + 1)) + 3` | \(k_{1} = \lambda x.x\) |
| `4 * (5 + 1)` | \(k_{2} = \lambda x. k_{1} (x + 3) = \lambda x. x + 3\) |
| `5 + 1` | \(k_{3} = \lambda x. k_{2} (4 * x) = \lambda x. (4 * x) + 3\) |
| `6` | \(k_{3}\) |
| `4 * 6` | \(k_{2}\) |
| `24` | \(k_{2}\) |
| `24 + 3` | \(k_{1}\) |
| `27` | \(k_{1}\) |
Let's try this again, but with a recursive function. We'll evaluate the following:

```
let rec factorial n =
  if n <= 1 then 1 else n * factorial (n - 1)

factorial 3
```
| Term being evaluated | Normal continuation |
|---|---|
| `factorial 3` | \(k_{1} = \lambda x. x\) |
| `if 3 <= 1 then 1 else 3 * factorial (3 - 1)` | \(k_{1}\) |
| `3 * factorial (3 - 1)` | \(k_{1}\) |
| `factorial (3 - 1)` | \(k_{2} = \lambda x. k_{1} (3 * x)\) |
| `3 - 1` | \(k_{3} = \lambda x. k_{2} (factorial\ x)\) |
| `2` | \(k_{3}\) |
| `factorial 2` | \(k_{2}\) |
| `if 2 <= 1 then 1 else 2 * factorial (2 - 1)` | \(k_{2}\) |
| `factorial (2 - 1)` | \(k_{4} = \lambda x. k_{2} (2 * x)\) |
| `2 - 1` | \(k_{5} = \lambda x. k_{4} (factorial\ x)\) |
| `1` | \(k_{5}\) |
| `factorial 1` | \(k_{4}\) |
| `if 1 <= 1 then 1 else 1 * factorial (1 - 1)` | \(k_{4}\) |
| `1` | \(k_{4}\) |
| `2 * 1` | \(k_{2}\) |
| `2` | \(k_{2}\) |
| `3 * 2` | \(k_{1}\) |
| `6` | \(k_{1}\) |
One way to think about this normal continuation is that it keeps track of the call stack. If we expand out \(k_{4} = \lambda x. 3 * (2 * x)\), we see that the continuation holds the computation to be performed as the recursive calls return.
## Continuation-passing style
Continuation-passing style (CPS) makes control flow explicit by representing the continuation as a function. We contrast CPS with direct style, which leaves the continuation implicit (direct style is the normal mode of programming for most people in most languages).
Here's a simple example of functions written in direct style.
```javascript
// Functions written in direct style
function callee(x) {
  return x + 1;
}

function caller() {
  return callee(5) * 2;
}

console.log(caller());
```

```
12
```
And those same functions written in continuation-passing style.
```javascript
// Functions written in continuation-passing style
function callee(x, k) {
  k(x + 1);
}

function caller(k) {
  callee(5, function(ret) {
    k(ret * 2);
  });
}

caller(console.log);
```

```
12
```
Note the difference in how we call `caller`.
In direct style, we can call `caller`, receive its return value, and pass that value to `console.log`.
In CPS, we pass `console.log` to `caller`, because `console.log` is how we want to consume the value that `caller` produces.
The main rule of continuation passing style is that rather than returning, functions call their continuation with the value that they would have returned. Therefore, instead of receiving a value by calling a function and waiting for it to return, we must pass the function a lambda that takes the return value as an argument and continues the computation.
### Factorial example
Here's another example. We'll transform this factorial function, written in direct style, into a version in continuation-passing style.
```javascript
function factorial(n) {
  if (n == 0) { return 1; }
  else { return n * factorial(n - 1); }
}

console.log(factorial(5));
```

```
120
```
It's a bit clearer to see how the CPS version corresponds to the direct style version if we refactor slightly first. We'll extract out all the subexpressions and give them names.
```javascript
function factorial(n) {
  if (n == 0) { return 1; }
  else {
    let x = n - 1;
    let r = factorial(x);
    return n * r;
  }
}

console.log(factorial(5));
```

```
120
```
We can see that the CPS version replaces the first instance of `return` with a call to the continuation `k`.
The function call does not return; instead, we pass a continuation that consumes the "return value" `r`.
We don't return from that continuation either.
Instead, we pass the result on to `k`.
```javascript
function factorial_cps(n, k) {
  if (n == 0) { k(1); }
  else {
    let x = n - 1;
    factorial_cps(x, function(r) {
      k(n * r);
    });
  }
}

factorial_cps(5, console.log);
```

```
120
```
### Non-local exits
Imagine that we are performing some computation, and we want to be able to finish the computation early. In this example, we have a tree with integers at the leaves, and we want to take a product of the leaf values. Recursion is the most natural way to write this program:
```javascript
let tree = {
  left: 3,
  right: {
    left: 1,
    right: 2
  }
};

function tree_product(t) {
  if (typeof t === 'number') {
    return t;
  } else {
    return tree_product(t.left) * tree_product(t.right);
  }
}

console.log(tree_product(tree));
```

```
6
```
We might like to finish this function early if one of the leaves is zero, to avoid looking at the rest of the tree. We could rewrite our function to use iteration instead of recursion, so we could use an early return, but we have another option.
If we instead translate our code into CPS, adding an early return is straightforward, even one that returns from deep in a stack of recursive calls.
```javascript
function tree_product_cps(t, k) {
  console.log('tree product', t);
  if (typeof t === 'number') {
    k(t);
  } else {
    tree_product_cps(t.left, function(left) {
      tree_product_cps(t.right, function(right) {
        k(left * right);
      });
    });
  }
}

tree_product_cps(tree, console.log);
```

```
tree product { left: 3, right: { left: 1, right: 2 } }
tree product 3
tree product { left: 1, right: 2 }
tree product 1
tree product 2
6
```
Here's the version that returns early. We pass an additional "early return" continuation, which allows us to avoid looking at the remaining tree once we have seen a zero.
```javascript
let tree = {
  left: {
    left: 0,
    right: 2
  },
  right: {
    left: 1,
    right: 3
  }
};

function tree_product_cps(t, k_early_return, k) {
  console.log('tree product', t);
  if (typeof t === 'number') {
    if (t === 0) {
      k_early_return(t);
    } else {
      k(t);
    }
  } else {
    tree_product_cps(t.left, k_early_return, function(left) {
      tree_product_cps(t.right, k_early_return, function(right) {
        k(left * right);
      });
    });
  }
}

tree_product_cps(tree, console.log, console.log);
```

```
tree product { left: { left: 0, right: 2 }, right: { left: 1, right: 3 } }
tree product { left: 0, right: 2 }
tree product 0
0
```
### Exceptions
Exceptions are a close cousin of non-local exits.
We could use the early exit continuation as a "failure continuation," and that would let us write programs that fail.
If we want more standard exception handling with `try/catch` and `throw`, we have to do a little bit of additional work.
The following example shows how we could write a simple version of `try/catch` and `throw` for programs in CPS.
The `try_catch` function takes two functions, `body` and `on_exn`, and a continuation `k`.
Execution begins with `body` (the try clause of the try-catch), but if `throw_` is called inside `body`, execution jumps to `on_exn` (the catch clause).
Execution continues with `k` after either the body completes or it raises an exception.

We use a stack of exception handlers, `handlers`, to keep track of the catch clause to execute.
When we enter a try-catch, we push the handler onto the stack.
When we finish the body of the try-catch, we pop the handler off and continue.
When `throw_` is called, we discard whatever computation remains after `throw_` and call the topmost handler, popping it off the stack.
```javascript
let handlers = [];

function try_catch(body, on_exn, k) {
  handlers.push(function () { on_exn(k); });
  body(function (x) {
    handlers.pop();
    k(x);
  });
}

function throw_(k) {
  handlers.pop()();
}

try_catch(function(k) {
  console.log('Starting computation.');
  try_catch(function(k) {
    console.log('Trying out nested exceptions...');
    throw_(k);
  }, function(k) {
    console.log('Caught nested exception. Ignoring.');
    k();
  }, function(k) {
    console.log('Starting actual work.');
    tree_product_cps(tree, throw_, k);
  });
}, function(k) {
  console.log('Caught an exception! Returning dummy value -1.');
  k(-1);
}, function(prod) {
  console.log(`Computed product ${prod}`);
});
```

```
Starting computation.
Trying out nested exceptions...
Caught nested exception. Ignoring.
Starting actual work.
tree product { left: { left: 0, right: 2 }, right: { left: 1, right: 3 } }
tree product { left: 0, right: 2 }
tree product 0
Caught an exception! Returning dummy value -1.
Computed product -1
```
This implementation of exceptions doesn't support throwing and catching named exceptions (e.g. `RuntimeError`, `SyntaxError`, etc.).
We can add this feature in a straightforward way by adding a new handler stack for each kind of exception.
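To make that concrete, here is a minimal Python sketch of the per-kind-handler idea (the names and the `log`-based driver are ours, not from the notes): `handlers` maps each exception kind to its own stack, and `throw_` targets a specific kind.

```python
# Hypothetical sketch: one handler stack per exception kind.
handlers = {}  # exception kind -> stack of handler thunks

def try_catch(kind, body, on_exn, k):
    # Push a handler for this kind; it runs the catch clause with k.
    handlers.setdefault(kind, []).append(lambda: on_exn(k))
    def done(x):
        # Body finished normally: discard the handler and continue.
        handlers[kind].pop()
        k(x)
    body(done)

def throw_(kind, k):
    # Discard the rest of the computation (k is ignored) and jump to
    # the topmost handler for this kind.
    handlers[kind].pop()()

log = []

def body(k):
    log.append('trying')
    throw_('ZeroLeaf', k)

def on_zero(k):
    # Catch clause: record the exception and continue with a dummy value.
    log.append('caught ZeroLeaf')
    k(-1)

try_catch('ZeroLeaf', body, on_zero, lambda v: log.append(f'result {v}'))
print(log)
# → ['trying', 'caught ZeroLeaf', 'result -1']
```

As in the JavaScript version, the "catch" is just a jump: `throw_` never returns to its caller, it transfers control to the handler, which then resumes the continuation saved when the try-catch was entered.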
### Mechanization
You might have noticed from the previous examples that the translation to CPS is fairly mechanical, and in fact it is straightforward to transform a piece of code into CPS entirely automatically. The rules for this transformation on the (call-by-value) lambda calculus are as follows:
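The rules were presented as a figure, not reproduced here; a standard call-by-value reconstruction, consistent with the rule-by-rule discussion below, is:

\begin{align*}
CPS(x) &= \lambda k.\ k\ x\\
CPS(\lambda x.\ e) &= \lambda k.\ k\ (\lambda x.\ CPS(e))\\
CPS(f\ e) &= \lambda k.\ CPS(f)\ (\lambda f'.\ CPS(e)\ (\lambda e'.\ f'\ e'\ k))
\end{align*}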
At a high level, programs are transformed into functions that take a continuation \(k\) and apply \(k\) to the value that the program would have returned. As in the examples above, calling \(k\) is analogous to returning a value.
If we consider the rules one at a time, we see that:
- The rule for translating a variable \(x\) takes the continuation \(k\) and applies it to \(x\).
- The rule for translating a lambda first CPS-converts the inside of the lambda, then passes the result to \(k\).
- The rule for translating function application is a little more interesting. We can make it slightly clearer by rewriting it to use `let` bindings (remember that when we talked about desugaring, we mentioned that function application and non-recursive `let` bindings can be rewritten into each other):
  \begin{align*} &CPS(f\ e) =\\ &\quad\lambda k.\\ &\quad\quad\mathsf{let}\ f' = CPS(f)\ \mathsf{in}\\ &\quad\quad\mathsf{let}\ e' = CPS(e)\ \mathsf{in}\\ &\quad\quad f'\ e'\ k \end{align*}
  After reframing the rule, it becomes clear that we are CPS-converting \(f\) and \(e\), applying \(f\) to \(e\), and continuing with \(k\).
We can get some confidence that these rules work by manually converting a term.
## Further reading
- Appel & Blazy, "Separation Logic for Small-step Cminor"
  - A good example of operational semantics written in continuation-passing style, and the inspiration for the `break` and `continue` example.
- More discussion of continuation-passing style and CPS conversion:
  - The examples in the Continuations and Continuation-passing style sections come from Ch. 9 of *Design Concepts in Programming Languages*.
  - Oleg Kiselyov's entire page on delimited control & continuations is worth a read. (Or just read the whole site… it's a PL goldmine.)