6.S050 Problem Set 2–Semantics

Due: Thu, Mar 9, 2023 @ 11:59 PM

1. Problem Set 1 Peer Feedback (20 points)

You will receive one of your classmates' syntax designs on Canvas. Answer the following questions about their submission. Please keep your feedback respectful and constructive.

  1. Try to write the following query in the proposed syntax:

    "Using the apartment data, list the apartments from largest (square footage) to smallest."

    Give an explanation of how your answer corresponds to the query specification (essentially, explain to your classmate how you think their syntax is intended to be used). If the proposed syntax cannot express this query, suggest an extension that would make it sufficiently expressive.

  2. Are the example queries clear and easy to read? Suggest a modification that you think will improve clarity.
  3. Compare and contrast your version of query #3 with your classmate's. Which of your syntax design choices are similar, and which differ? Did you choose to make different trade-offs with respect to clarity or concision? Remember that they will not be able to see your syntax design, so be sure to give examples of your syntax.

Remember not to put your name on your answer to this question, as it will be given to your classmate.

2. Specifying Language Features Using Operational Semantics (40 points)

In class, we introduced operational semantics using IMP, a simple imperative language. In this problem you will build your skills at writing operational semantics by extending IMP. Write your answers to these problems in the style presented in lecture.

You will be reading each other's answers to this question and giving feedback as part of problem set 3, so don't include your name when you turn in your answers.

For your convenience, here is the syntax and semantics of IMP.

<stmt>  ::= skip
	  | <stmt> ; <stmt>
	  | <var> = <expr>
	  | while <expr> do <stmt>
	  | if <expr> then <stmt> else <stmt>
<value> ::= <int>
	  | <bool>
<expr>  ::= <var>
	  | <value>
	  | <unop> <expr>
	  | <expr> <binop> <expr>
<unop>  ::= - | not
<binop> ::= + | - | * | / | and | or | < | > | ==

The reduction rules are as follows. We've omitted the rules for expressions, but you can assume that they follow the usual semantics of boolean logic and integer arithmetic.

\begin{align*}
\newcommand{\sconfig}[2][\sigma]{\langle #1, #2 \rangle}
\sconfig{\texttt{skip} ; stmt} &\rightsquigarrow \sconfig{stmt} \\
\sconfig{var = value} &\rightsquigarrow \sconfig[{\sigma[var \mapsto value]}]{\texttt{skip}} \\
\sconfig{\texttt{while}\ expr\ \texttt{do}\ stmt} &\rightsquigarrow \sconfig{\texttt{if}\ expr\ \texttt{then}\ (stmt ; \texttt{while}\ expr\ \texttt{do}\ stmt)\ \texttt{else}\ \texttt{skip}} \\
\sconfig{\texttt{if}\ \texttt{true}\ \texttt{then}\ stmt\ \texttt{else}\ stmt'} &\rightsquigarrow \sconfig{stmt} \\
\sconfig{\texttt{if}\ \texttt{false}\ \texttt{then}\ stmt\ \texttt{else}\ stmt'} &\rightsquigarrow \sconfig{stmt'}
\end{align*}
\begin{prooftree}
\AxiomC{$\sigma[var] = value$}
\UnaryInfC{$\langle \sigma, var \rangle \rightsquigarrow \langle \sigma, value \rangle$}
\end{prooftree}

These rules describe how redexes (reducible expressions) are reduced; they define the computational core of the language. (We've omitted the inference bar for rules that have no preconditions.) However, these rules don't specify how to locate the next redex to reduce.

We select the next redex using reduction contexts. Contexts are terms with a single hole (denoted by □). The grammar of contexts for IMP is:

\begin{align*} C &::= □ \mid C ; stmt \mid var = C \mid \texttt{if}\ C\ \texttt{then}\ stmt\ \texttt{else}\ stmt' \mid unop\ C \\ &\mid C\ binop\ expr \mid value\ binop\ C \end{align*}

We define the small-step execution relation using these contexts as follows:

\begin{prooftree} \AxiomC{$\langle \sigma, E \rangle \rightsquigarrow \langle \sigma', E' \rangle$} \UnaryInfC{$\langle \sigma, C[E] \rangle \rightarrow \langle \sigma', C[E'] \rangle$} \end{prooftree}

This rule says that we can take a step of execution by finding a context \(C\), reducing the expression (or statement) inside, and plugging the result back into the context. States of the form \(\langle \sigma, \texttt{skip} \rangle\) are terminal states; execution stops when it reaches one of these states.
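
For example, assuming the usual arithmetic rule \(\langle \sigma, 1 + 2 \rangle \rightsquigarrow \langle \sigma, 3 \rangle\) (one of the omitted expression rules), the program x = 1 + 2 executes in two steps: the first uses the context \(C = (\texttt{x} = □)\) to reduce the right-hand side, and the second uses the empty context \(C = □\) together with the assignment rule.

\begin{align*}
\langle \sigma, \texttt{x} = 1 + 2 \rangle &\rightarrow \langle \sigma, \texttt{x} = 3 \rangle \\
\langle \sigma, \texttt{x} = 3 \rangle &\rightarrow \langle \sigma[\texttt{x} \mapsto 3], \texttt{skip} \rangle
\end{align*}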

We can define the multistep execution relation \(\rightarrow^*\) as follows:

\begin{prooftree}
\AxiomC{$\langle \sigma, E \rangle \rightarrow^* \langle \sigma, E \rangle$}
\end{prooftree}
\begin{prooftree}
\AxiomC{$\langle \sigma, E \rangle \rightarrow \langle \sigma', E' \rangle$}
\AxiomC{$\langle \sigma', E' \rangle \rightarrow^* \langle \sigma'', E'' \rangle$}
\BinaryInfC{$\langle \sigma, E \rangle \rightarrow^* \langle \sigma'', E'' \rangle$}
\end{prooftree}

For example, \(\langle \sigma, E \rangle \rightarrow^* \langle \sigma', \texttt{skip} \rangle\) says that the program \(E\) eventually terminates in state \(\sigma'\) if we run it in state \(\sigma\).
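
Chaining the two steps of the assignment example above with the rules for \(\rightarrow^*\) (and closing with the reflexive rule) gives \(\langle \sigma, \texttt{x} = 1 + 2 \rangle \rightarrow^* \langle \sigma[\texttt{x} \mapsto 3], \texttt{skip} \rangle\): that program terminates in the store \(\sigma[\texttt{x} \mapsto 3]\).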

2.1. for loops

In this problem, you'll extend IMP to support for loops.

Here's the IMP grammar extended with for loops:

<stmt>  ::= skip
	  | <stmt> ; <stmt>
	  | <var> = <expr>
	  | while <expr> do <stmt>
	  | if <expr> then <stmt> else <stmt>
	  | for <var> from <expr> to <expr> do <stmt>
<value> ::= <int>
	  | <bool>
<expr>  ::= <var>
	  | <value>
	  | <unop> <expr>
	  | <expr> <binop> <expr>
<unop>  ::= - | not
<binop> ::= + | - | * | / | and | or | < | > | ==

Here's an example of for loops in action (we've given them a concrete syntax for convenience):

y = 0;
for x from 1 to (y + 10) {
  y = y + x;
}
// x = 10, y = 55

There are a few interesting design decisions around for loops:

  • What interval should the loop iterate over? \([lower, upper]\)? \([lower, upper)\)?
  • When should the expressions for the lower and upper bounds be evaluated? Before entering the loop? Before every loop iteration? In what order are the bounds evaluated?
  • Is the variable <var> available after the loop terminates? What if the loop performs zero iterations?
  • What happens if <var> is mutated in the loop?

In our version of for loops, we'd like:

  • The loop to iterate over \([lower, upper]\).
  • The loop bounds to be evaluated once, before the loop is entered, with the lower bound evaluated before the upper bound.
  • The loop variable to be available after the loop exits, but only if the loop executed at least one iteration; in that case it should be equal to the upper bound.
  • Mutation of the loop variable inside the loop body to have no effect on the number of iterations the loop performs (see the examples after this list).
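
To make these requirements concrete, here are two small programs (in the same concrete syntax as the example above) together with the final states we would expect under the requirements; they are illustrations, not additional specifications.

y = 0;
for x from 1 to 3 {
  y = y + 1;
  x = 100
}
// three iterations despite the mutation of x: y = 3, x = 3

for x from 5 to 3 {
  y = 1
}
// zero iterations: the body never runs, and x is not available afterwards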

Write the semantics of for loops in small-step style. You'll need to extend the grammar of contexts and add at least one new rule to the \(\rightsquigarrow\) relation.

2.2. Parallelism

In this problem, you'll extend IMP to support parallel programming with fork and channels (this is easier than it sounds). The syntax of parallel IMP is as follows:

<stmt>  ::= skip
	  | <stmt> ; <stmt>
	  | <var> = <expr>
	  | while <expr> do <stmt>
	  | if <expr> then <stmt> else <stmt>
	  | fork <stmt>
	  | <channel> <- <expr>
	  | <channel> -> <var>
<value> ::= <int>
	  | <bool>
<expr>  ::= <var>
	  | <value>
	  | <unop> <expr>
	  | <expr> <binop> <expr>
<unop>  ::= - | not
<binop> ::= + | - | * | / | and | or | < | > | ==
<channel> ::= <var>

Parallel IMP adds three new statements:

  • fork <stmt> behaves like Unix fork. It starts a new child thread that inherits the values of the variables in the parent thread, but changes to a variable in one thread are not visible to the other. Instead, threads communicate with each other using channels.
  • <channel> <- <expr> writes the value of <expr> into <channel>.
  • <channel> -> <var> reads from <channel> into <var>.

Note that channels are synchronous. Channels are not like lists; they do not contain data. A thread that writes into a channel blocks until another thread reads from the same channel. This is the same as the semantics of unbuffered channels in Go.

The following example program starts two threads that pass the value of x back and forth through the channel ch. One of the threads decrements x before sending it, so both loops eventually terminate.

fork {
  ch -> x;
  while (x > 0 or x == 0) {
    ch <- x; ch -> x
  }
};
x = 4;
ch <- x;
while (x > 0) {
  ch -> x; ch <- x - 1
}

2.2.1. Write the configuration for parallel IMP

Serial IMP has configurations \(\langle \sigma, e \rangle\), where \(\sigma\) is a store (a map from variable names to values) and \(e\) is a program. The configuration for parallel IMP needs to keep track of multiple threads of execution, each of which has its own program. Threads in parallel IMP cannot see each other's modifications to variables, so each thread also needs its own store. Write the configuration for a parallel IMP program.

2.2.2. Write the semantics of parallel execution

Now we will define a new relation \(\rightarrow_p\) that describes how parallel IMP programs transition from one state to another. For now, ignore the new constructs in parallel IMP. We'll assume that at each step of execution for a parallel IMP program, one thread (chosen nondeterministically) will take an execution step according to the rules for serial IMP. Write a small-step rule that formalizes this behavior. Your rule should use the \(\rightarrow\) relation defined earlier.

Hint: You may find it convenient to define new kinds of contexts as you write these rules.

2.2.3. Write the semantics of fork

fork creates a new thread of execution. The currently executing thread continues execution at the statement after the fork, and the new thread starts execution at the statement inside the fork. The new thread starts with the same values of all local variables as its parent thread. When the child thread finishes executing the statement in the fork, it terminates; it does not continue on to execute the statements after the fork.

Write a small-step rule that describes the behavior of fork.

2.2.4. Write the semantics of channel read and write

<channel> <- <expr> writes the value of <expr> to the channel <channel>. The writing thread will not continue until another thread reads from the channel with <channel> -> <var>. Reading from a channel blocks until another thread writes into the channel. Channels are not buffered, and do not contain any state; each write is matched with exactly one read.

Write a small-step rule that describes the behavior of channels. Also write the extension to contexts \(C\) that allows the right-hand side of channel writes to be evaluated.

3. Implementing Parallel IMP (40 points)

Implement an interpreter for parallel IMP, using the semantics you defined in the previous problem. These semantics are nondeterministic, so programs may terminate in different states depending on the choices that your interpreter makes. Whenever you need to make a nondeterministic choice, make it randomly.

Note that there is no actual parallelism in this problem; you're really implementing a simulator of your semantics. Simulators of this kind are good starting points for debuggers and other kinds of bug-finding tools.
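
As a rough, self-contained sketch of what "choose randomly at each step" can look like, the snippet below models each thread as a Python generator and treats one next() call as one small step. None of these names (run_interleaved, worker) come from the provided skeleton, and your interpreter will also have to track stores, channels, and blocked threads, which this sketch ignores.

import random

def run_interleaved(threads):
    # Each "thread" is a generator; advancing it with next() models one small
    # step of that thread. This only illustrates the random scheduling choice;
    # it is not the IMP interpreter (blocking on channels is not modeled here).
    live = list(threads)
    while live:
        thread = random.choice(live)  # the nondeterministic choice, made randomly
        try:
            next(thread)
        except StopIteration:         # this thread has finished executing
            live.remove(thread)

def worker(name, steps):
    # Stand-in for a thread: it performs `steps` small steps, printing each one.
    for i in range(steps):
        print(f"{name} takes step {i}")
        yield

run_interleaved([worker("A", 3), worker("B", 3)])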

  1. Fill in the functions in the provided skeleton code that raise NotImplementedError (code, tests).
  2. Write a parallel IMP program that computes the sum of numbers from 1 to 100 using four worker threads. Add a test called test_parallel_sum to the test file that checks that this program behaves as expected.
  3. Write a program that has a race condition (i.e. it has more than one possible final state). Add a test called test_racy that checks that this program terminates in one of its possible final states.

You can run the provided tests by running pytest in the same directory as the tests. This is how we'll run the tests that you turn in as well.

Last updated: 2023-03-06 Mon 16:19