Big-step Operational Semantics
Calc
language
In describing the semantics of a language, we are trying to give a precise description of what programs in that language do: what they evaluate to, and what effects they perform while they are evaluating.
Now we introduce a little calculator language Calc
that we'll use to illustrate the concepts in these notes:
<stmt> ::= <stmt> ; <stmt> | print <expr> | <var> = <expr>
<expr> ::= <int> | <var> | <expr> <binop> <expr>
<binop> ::= + | - | * | /
Calc
programs consist of a sequence of statements, evaluated in order.
Each statement can assign an expression to a variable or print it out.
Here's an example program:
x = 2; y = x * 4; print y; // output: 8 x = y / 2; print (x + 1) * 3; // output: 15
As we work through the material in the lecture, we'll show how these ideas can be made concrete by giving Python code for an interpreter.
In the interpreters we define in this course, we'll represent syntax in Python using dataclasses
(see here for a quick explanation).
The syntax of the Calc
language is defined as follows:
from dataclasses import dataclass
@dataclass
class Expr:
pass
@dataclass
class Var(Expr):
name: str
@dataclass
class Int(Expr):
val: int
@dataclass
class Binop(Expr):
op: str
lhs: Expr
rhs: Expr
@dataclass
class Stmt:
pass
@dataclass
class Print(Stmt):
expr: Expr
@dataclass
class Set(Stmt):
var: str
expr: Expr
@dataclass
class Seq(Stmt):
lhs: Stmt
rhs: Stmt
@dataclass
class Skip(Stmt):
pass
Big-step (natural) operational semantics
We describe the behavior of Calc
programs by defining a relation (called \(\Downarrow\)) between programs and the results of evaluating those programs.
We call this a big-step semantics because evaluation takes place in one large step from program to final value.
We consider programs \(e\) in the context of some store \(\sigma\), where \(\sigma\) is a mapping from variables to values. Our programs can produce output, which we represent as a list \(\omega\) of the integers printed by the program. Our evaluation relation is between pairs \(\langle e, \sigma \rangle\) (program + input store) and \(\langle \sigma', \omega \rangle\) (output store + outputs). You can read \(\langle e, \sigma \rangle \Downarrow \langle \sigma', \omega \rangle\) as "executing \(e\) with variable assignments \(\sigma\) produces new assignments \(\sigma'\) and outputs \(\omega\)."
Conceptually, \(\Downarrow\) is just a set of these transitions from program + input store to output store + outputs. However, for the languages that we care about, it will be an infinite set, so we can't write it down directly. Instead, we will describe \(\Downarrow\) using inference rules, which allow us to prove that a particular transition is in fact a member of \(\Downarrow\).
What does it mean that \(\Downarrow\) is a relation, and not a function? What would we lose if we required it to be a function?
An inference rule can be read as follows: \[ \frac{premise_1\ premise_2\ \dots premise_k}{conclusion}. \] If the premises above the bar hold, then the conclusion holds as well. For example,
\begin{prooftree} \AxiomC{$\langle e, \sigma \rangle \Downarrow_e n$} \RightLabel{B-Print} \UnaryInfC{$\langle \texttt{print}\ e, \sigma \rangle \Downarrow \langle \sigma, [n] \rangle$} \end{prooftree}
says that executing a statement print e
with store \(\sigma\) does not change the store, but produces an output \(n\) if e
evaluates to \(n\).
(You can see that we have a \(\Downarrow_e\) in addition to \(\Downarrow\). Expressions evaluate to a value, and do not produce any side effects, so we give a separate relation from expression + input store to output value).
The following is the big-step semantics for Calc
:
You can see that the semantics are quite concise, although we're taking a slight liberty by giving a single rule for evaluating binary operations rather than a rule for each operator.
Big-step semantic rules correspond more-or-less directly to an interpreter, as follows:
from calc_syntax import *
def big_step_expr(expr, store):
'''Implementation of the =>e relation. Takes an expression + input store and
returns a value.'''
match expr:
case Binop(op, e_lhs, e_rhs): # B-Binop
n_lhs = big_step_expr(e_lhs, store)
n_rhs = big_step_expr(e_rhs, store)
match op:
case "+":
return n_lhs + n_rhs
case "-":
return n_lhs - n_rhs
case "*":
return n_lhs * n_rhs
case "/":
return n_lhs // n_rhs
case Var(v) if v in store: # B-Var
return store[v]
case Int(n): # B-Int
return n
def big_step(stmt, store):
'''Implementation of the => relation. Takes an expression + input store and
returns an output store + outputs.'''
match stmt:
case Seq(lhs, rhs): # B-Seq
(store1, out1) = big_step(lhs, store)
(store2, out2) = big_step(rhs, store1)
return (store2, out1 + out2)
case Print(e): # B-Print
n = big_step_expr(e, store)
return (store, [n])
case Set(v, e): # B-Set
store1 = store.copy()
store1[v] = big_step_expr(e, store)
return (store1, [])
example = Seq(Set("x", Int(2)), Print(Binop("*", Var("x"), Int(4))))
print(big_step(example, {}))
({'x': 2}, [8])
You can download the syntax and big-step semantics implementations.
Further reading
- First paper on big-step semantics, "Natural Semantics" (Gilles Kahn, 1987).