6.S050 Problem Set 3: Functional languages & Naming

1. Problem set 2 peer feedback (20 points)

You will receive one of your classmates' semantics for parallel IMP. Answer the following questions about their submission and post your response as a note on Canvas.

Pick one aspect of their formalism that you think is particularly clear or elegant, and explain why.
Pick one aspect of their formalism (e.g. use/lack of reduction contexts, structure of configuration, etc.) that you found confusing or verbose, and suggest an improvement. Feel free to share what you chose to do in your formalism.

2. Functional languages (40 points)

2.1. Free and bound variables (5 points)

For each expression, indicate whether each identifier is free or bound. If it is bound, draw an arrow from the identifier to its binder.

(fun x -> y (fun y -> y + 1)) y
(f a) + (f (fun f -> f (fun a -> a)))
(fun x -> f x) (fun y -> f y)

2.2. Substitution (15 points)

In lecture, we introduced the idea of capture-avoiding substitution. Our semantics for FL describe function application using substitution, and we talked about the importance of not capturing free variables during substitution. In this problem, you'll investigate the ways in which capture-avoiding substitution can be implemented incorrectly.

Consider the case where we are substituting an expression \(e\) for a variable \(x\) in an abstraction \(\lambda x'.\ e'\) where \(x \neq x'\), written \((\lambda x'. e')[e/x]\). We said that before substitution can occur, we rename \(x'\) to some fresh variable \(x_{fresh}\) that does not appear free in either \(e\) or \(e'\).

What happens if we allow \(x_{fresh}\) to be one of the free variables of \(e\)? Give an example where this causes a variable to be captured.
What happens if we allow \(x_{fresh}\) to be one of the free variables of \(e'\)? Give an example where this causes a variable to be captured.
Can capture-avoiding substitution be implemented by renaming the free variables of \(e\)? Explain why or why not.

Source: Exercise 6.16, Design Concepts in Programming Languages

2.3. Desugaring (20 points)

When adding a new construct to a language, it's often convenient to describe its semantics by giving a translation, or desugaring, into the existing language constructs.

For example, let's extend the functional language FL, that we discussed in the notes, with let bindings. A let binding (here we're only considering non recursive let bindings; this translation does not work for recursive let bindings) let x = e in e' should behave like a variable declaration, so let x = 1 in x + 2 would evaluate to 3. We could extend the semantics of FL to incorporate let bindings, but it's easier to describe how to write let bindings using the constructs we already have: functions and application. Desugaring let x = e in e' into (fun x -> e') e gives the desired semantics.

We can write our desugaring as a rule \(\texttt{let}\ x = e\ \texttt{in}\ e' \Rightarrow (\texttt{fun}\ x \rightarrow e')\ e\) which says that any let can be translated into an application of a function.

Extend FL with the following when construct by giving a rule for desugaring it into multiple if-then-else clauses: when cond1 -> expr1 | ... | condN -> exprN.

To show how the when construct works, consider the following example, which consists of clauses x > 0 -> y + 1, x < 0 -> y - 1, and true -> y. A clause has a condition (an expression that evaluates to true or false) on its left-hand-side and an arbitrary expression on its right-hand-side. The conditions are evaluated in order, and the right-hand-side corresponding to the first true condition is evaluated and returned.
```
(fun x y -> when
  | x > 0 -> y + 1
  | x < 0 -> y - 1
  | true -> y) 1 2
```
This example evaluates to 3, because x > 0 is the first true condition and y + 1 is 3 if y = 2.
Extend FL with pairs. Give a desugaring for the pair constructor pair x y and the two accessors fst p and snd p. The pair constructor and accessors should satisfy the following invariants:
- For all values x and y, fst (pair x y) = x and snd (pair x y) = y.

3. Scoping design challenge (40 points)

You are presented with the following language design.

J is a scripting language with a lightweight object system (it's JavaScript, but we want you to ignore the class system, arrow functions, prototypes, etc—treat this as a new language that only has objects and functions). Rather than creating objects using classes, J has first-class functions and object literals: Here's a J program that creates a counter object and increments it:

let counter = {
  count: 0,
  incr: function () {this.count = this.count + 1}
};
counter.incr();
console.log(counter);

{ count: 1, incr: [Function: incr] }

J has an unusual design feature: the this keyword. this can refer to several things, according to the following rules:

When this is accessed inside a function:
- If the function was called by accessing it from an object o, then this refers to o.
- If the function is called directly, then this has the value undefined.
When this is accessed outside a function (this includes accesses inside an object, but outside a function), it refers to the "global object"—a singleton object created at the start of execution.

You can see why the designers of J added this to the language; it makes object-oriented programming simpler, which makes J more familiar to programmers coming from Java or C++. However, the semantics of this are generally considered unnatural.

We can demonstrate the strange semantics of this with two examples:

The scoping rule for this is dynamic, not lexical. The following creates a counter that behaves identically to the code above:
```
let counter = {
  count: 0
};
counter.incr = function () {this.count = this.count + 1};
counter.incr();
console.log(counter);
```
```
{ count: 1, incr: [Function (anonymous)] }
```
If this were lexically scoped to the body of the object literal, this code would not work. Instead, the value of this depends on the way that incr is called.

To make this point clearer, if we take the incr method out of the object and call it directly, the counter does not increment (and in fact the function call will throw an exception), because this is now undefined.
Example A: Object method reference
```
let counter = {
  count: 0,
  incr: function () {this.count = this.count + 1},
};
let incr = counter.incr;
incr();
```
This is a reasonable thing to do in real code (maybe you want to make many calls to a method on an object with a long name), so we'd like this idiom to be easy to express in your revised version of J.
Moving this between nested scopes is problematic: The following code works:
```
let counter = {
    count: 0,
    print: function () { console.log(this.count); }
};
counter.print();
```
```
0
```
But what if we want to print output after a delay? We can use the setTimeout function, which takes a callback and a time to wait:
```
let counter = {
    count: 0,
    print: function () {
	setTimeout(function() {
	    console.log(this.count);
	}, 100);
    }
};
counter.print();
```
```
undefined
```
That broke our printing function! Why?

this is now inside another function, and that function is not being called as a method on counter. The correct version does something like this:
Example B: reference to this in a nested scope
```
let counter = {
    count: 0,
    print: function () {
	let this_ = this;
	setTimeout(function() {
	    console.log(this_.count);
	}, 100);
    }
};
counter.print();
```
```
0
```
Functions that take a callback are common in J, and this workaround is clumsy. We'd like to modify the rules for this so that refactoring by moving code in or out of a function is safer. Your revised version of J should make it easy to express code like this.
J is still a dynamic language, and programmers should be able to create objects dynamically. Your modifications to J should allow programmers a reasonable way to write the following code, which implements a generic object cloning function. Note that the semantics of this means that objects that are written in the standard way will be correctly cloned.
Example C: dynamic object construction
```
function clone(obj) {
    let rv = {};
    for (let x in obj){
	rv[x] = obj[x];
    }
    return rv;
}
```

Propose an alternative design for J that either improves the semantics of this or replaces it with a better mechanism.

Write an explanation of your new design and justify your design choices (2-3 paragraphs). The following questions are good starting points:

Is your new design simpler than the original? Easier to understand? Why?
Are there potential programming errors that your design avoids?
Is your design more/less expressive than the original? If less, can you still write normal object-oriented programs?

Explain how examples A, B, and C would be written in your new design and why your design is an improvement over the original J.

You do not need to formalize your design.

1. Problem set 2 peer feedback (20 points)#

2. Functional languages (40 points)#

2.1. Free and bound variables (5 points)#

2.2. Substitution (15 points)#