Programming Language Design

6.S050

Created: 2023-05-11 Thu 11:06

Staff

Schedule

Lectures: TR 1-2:30 in 56-154

No recitations.

Office hours: TBD. They will be posted on the course website and Canvas.

Assignments

  • 6 problem sets, roughly one every 2 weeks
  • 5 reading responses

Problem Sets

  • Design questions—short answer
  • Implementing interpreters in Python
  • 6 problem sets, roughly every 2 weeks

Peer Feedback

  • We will assign another student to give feedback on the design portions of your problem sets
  • You will be graded based on the quality of the feedback you give
  • Starting with Pset 2

Reading Response

  • Read a classic paper on programming language design
  • Write a short response (250 words)
  • We'll give you a question or prompt
  • 5 reading responses, alternating with problem sets

Grading

  • 15% per problem set
  • 10% total for reading responses

Late policy

  • Problem sets are due at 11:59pm
  • Late homework will be accepted for up to 3 days
  • 10% grading penalty per day

Extensions

  • Contact course staff before assignment is due
  • We will give you a 24-hour extension
  • If you need more, get in touch

Collaboration

  • You are welcome to discuss the problem sets with your classmates, but anything you turn in should be your own work
  • Citing sources is required in any written assignments

Why learn to design programming languages?

My answer

marge.png

Your answer

  • Write better code
  • Make more informed choices between languages
  • Learn new languages more quickly
  • … or maybe you just think programming languages are neat!

Parts of a language design

  • Syntax—what programs look like
  • Semantics—how programs behave
  • Pragmatics—how programs are executed efficiently

What we'll cover

  • Syntax—what programs look like
  • Semantics—how programs behave
  • Pragmatics—how programs are executed efficiently

What this class is and is not

  • Is
    • A broad look at language design
    • An examination of many different language features
    • Implementation focus on simplicity, not efficiency
  • Is not
    • A compilers class
    • A class about optimization

A quick warning

  • This is a new class
  • We're still working on the material
  • We think it will be fun!
    • …but there will be bugs

Why is PL design important?

https://www.destroyallsoftware.com/talks/wat

More seriously

"Null references: the billion dollar mistake" — Tony Hoare

  • Null references trade safety for performance
  • Regaining safety requires:
    • A verbose check at every dereference
    • Static analysis

Memory safety vulnerabilities

vuln.png

Source

Need for parallelism

dennard.png

Source

Performance gap

matmul-perf.png

Source

Growing codebases

loc-1.png

Source

Growing codebases

loc-2.png

Source

What does language design look like?

  • Design tasks on a spectrum
    • Engineering — lots of constraints, clear correctness criteria
    • Architecture
    • Fine art — relatively few constraints, highly creative
  • Language design ~ architecture
    • Some hard constraints
    • Lots of room for creativity

Language design criteria

  • What makes a language well designed?
  • Who are the users of PLs?
    • Programmers
    • Other programs/tools

Ergonomics for programmers

Ergonomics for tools

Anatomy of a language

  • As a bag of features
  • As a point in a high-dimensional feature space

Design axes

  • Syntax
  • Semantics
    • Naming
    • Control
    • Types
    • State
    • Data representation

Syntax

 IDENTIFICATION        DIVISION.
 PROGRAM-ID.           fizzbuzz.
 DATA                  DIVISION.
 WORKING-STORAGE       SECTION.
 01 CNT      PIC 9(03) VALUE 1.
 01 REM      PIC 9(03) VALUE 0.
 01 QUOTIENT PIC 9(03) VALUE 0.
 PROCEDURE             DIVISION.
*
 PERFORM UNTIL CNT > 100
   DIVIDE 15 INTO CNT GIVING QUOTIENT REMAINDER REM
   IF REM = 0
     THEN
       DISPLAY "FizzBuzz " WITH NO ADVANCING
     ELSE
       DIVIDE 3 INTO CNT GIVING QUOTIENT REMAINDER REM
       IF REM = 0
         THEN
           DISPLAY "Fizz " WITH NO ADVANCING
         ELSE
           DIVIDE 5 INTO CNT GIVING QUOTIENT REMAINDER REM
           IF REM = 0
             THEN
               DISPLAY "Buzz " WITH NO ADVANCING
             ELSE
               DISPLAY CNT " " WITH NO ADVANCING
           END-IF
       END-IF
   END-IF
   ADD 1 TO CNT
 END-PERFORM
 DISPLAY ""
 STOP RUN.

Syntax

let () =
  for i = 1 to 100 do
    let str = match i mod 3, i mod 5 with
      | 0, 0 -> "FizzBuzz"
      | 0, _ -> "Fizz"
      | _, 0 -> "Buzz"
      | _    -> string_of_int i
    in
    Printf.printf "%s " str
  done

Syntax

  • What makes syntax easy to read?
  • How should we specify syntax?

Naming

var funcs = [];
// let's create 3 functions
for (var i = 0; i < 3; i++) {
  // and store them in funcs
  funcs[i] = function() {
    // each should log its value.
    console.log("My value: " + i);
  };
}
for (var j = 0; j < 3; j++) {
  // and now let's run each one to see
  funcs[j]();
}
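
The JavaScript above logs `My value: 3` three times, because `var i` is scoped to the enclosing function and all three closures share that one variable. Python closures behave similarly, so the same surprise can be sketched in Python. The variable names here are made up for illustration, and the default-argument trick is just one common workaround, not the only one.

```python
# Each closure captures the variable i itself, not the value i had
# when the closure was created.
funcs = []
for i in range(3):
    funcs.append(lambda: i)

# By the time the closures run, the loop has finished and i is 2.
late = [f() for f in funcs]

# One common workaround: bind the current value as a default argument,
# which is evaluated once, when the lambda is created.
fixed_funcs = []
for i in range(3):
    fixed_funcs.append(lambda i=i: i)

early = [f() for f in fixed_funcs]
```

Here `late` is `[2, 2, 2]` and `early` is `[0, 1, 2]`. Whether a loop variable is one binding or a fresh binding per iteration is a naming decision the language designer makes.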

Naming

function myOuterFunc() {
  var date = new Date();
  function myInnerFunc() {
    console.log("The current date is " + date)
    if (true) {
      // Whoops, we accidentally reused this variable name!
      var date = "a string"
    }
  }
  myInnerFunc()
}

Source

Naming

function myOuterFunc() {
  var date = new Date();
  function myInnerFunc() {
    var date = undefined;
    console.log("The current date is " + date)
    if (true) {
      // Whoops, we accidentally reused this variable name!
      date = "a string"
    }
  }
  myInnerFunc()
}

Source
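
Python makes a related scoping choice, sketched below with hypothetical names that mirror the JavaScript. An assignment anywhere in a function body makes that name local to the whole body. Where JavaScript's hoisted `var` reads as `undefined`, Python raises an error on the early read instead.

```python
def my_outer_func():
    date = "2023-05-11"

    def my_inner_func():
        # Because `date` is assigned later in this body, Python treats it as
        # local to my_inner_func everywhere in the body, so this read happens
        # before the local has a value.
        message = "The current date is " + date
        if True:
            # Whoops, we accidentally reused this variable name!
            date = "a string"
        return message

    return my_inner_func()

try:
    my_outer_func()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"
```

Both languages resolve `date` to the inner scope; they differ only in what an early read produces.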

Naming

  • Which objects in a language can be named?
  • How are names organized & resolved?

Control

ON CONDITION(OVERDRAFT) BEGIN;
  ...
END;

IF ACCOUNT_BALANCE < TOTAL_WITHDRAWAL
THEN SIGNAL CONDITION(OVERDRAFT);
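
For comparison, a rough Python counterpart using exceptions. This is a sketch, not a translation: `Overdraft`, `withdraw`, and `try_withdraw` are hypothetical names, and the two mechanisms are not equivalent. A Python `raise` always unwinds the stack to the handler, whereas a PL/I ON-unit for a programmer-defined condition can return control to the point just after the SIGNAL.

```python
class Overdraft(Exception):
    """Hypothetical stand-in for the OVERDRAFT condition."""

def withdraw(account_balance: int, total_withdrawal: int) -> int:
    # Counterpart of SIGNAL CONDITION(OVERDRAFT).
    if account_balance < total_withdrawal:
        raise Overdraft(f"short by {total_withdrawal - account_balance}")
    return account_balance - total_withdrawal

def try_withdraw(account_balance: int, total_withdrawal: int) -> str:
    # Counterpart of the ON CONDITION(OVERDRAFT) handler block.
    try:
        remaining = withdraw(account_balance, total_withdrawal)
    except Overdraft as err:
        return f"declined: {err}"
    return f"ok: {remaining} left"
```

With these definitions, `try_withdraw(100, 30)` gives `"ok: 70 left"` and `try_withdraw(10, 30)` gives `"declined: short by 20"`.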

Control

  • What kinds of control-flow constructs should be provided?
    • …for describing loops & branches
    • …for handling errors

Types

String[] strings = new String[2];
Object[] objects = strings;  // compiles: String[] is treated as a subtype of Object[]
objects[0] = 12;             // compiles too, but throws ArrayStoreException at runtime
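
Because the Java type checker accepts the snippet above, the mistake can only be caught by a check on every array store at runtime. The Python sketch below imitates that store check. `CheckedArray` is a hypothetical class written for this illustration, not how the JVM implements arrays.

```python
class CheckedArray:
    """Fixed-size array that remembers its element type (illustrative only)."""

    def __init__(self, elem_type: type, size: int):
        self.elem_type = elem_type
        self.items = [None] * size

    def __setitem__(self, index: int, value: object) -> None:
        # The store check that covariant arrays force onto every write.
        if not isinstance(value, self.elem_type):
            raise TypeError(
                f"cannot store {type(value).__name__} "
                f"in array of {self.elem_type.__name__}"
            )
        self.items[index] = value

    def __getitem__(self, index: int) -> object:
        return self.items[index]

strings = CheckedArray(str, 2)
objects = strings        # same array, now viewed as "array of object"
objects[0] = "fine"      # passes the store check
try:
    objects[0] = 12      # rejected when the store happens, not before
    store_result = "stored"
except TypeError:
    store_result = "rejected"
```

The design question is whether a type system should accept a program that needs a check like this, or reject the aliasing assignment in the first place.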

Types

  • How can types be used to avoid common programming errors?
  • Can we reuse the semantic tools that we developed for specifying behavior to specify type systems?
  • Types are a deep subject
    • If you love this part of the class, you might also love:
    • 6.5110 Foundations of Program Analysis
    • 6.5120 Formal Reasoning About Programs

State

let mut x = "123".to_string();
let y = &mut x;

x.push_str("456");

println!("y = {}", y);

error[E0499]: cannot borrow `x` as mutable more than once at a time

Source
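
For contrast, a sketch of the same shape of program in Python, which has no borrow checker. The variable names are illustrative. The program runs, and the alias silently observes the mutation.

```python
x = list("123")
y = x                 # y is a second name for the same list, not a copy

x.extend("456")       # mutate through x ...

seen_through_y = "".join(y)   # ... and the change is visible through y

# Taking an explicit copy breaks the alias.
a = list("123")
b = a.copy()
a.extend("456")
seen_through_b = "".join(b)
```

Here `seen_through_y` is `"123456"` while `seen_through_b` is still `"123"`. Rust rejects the aliased mutation at compile time; Python leaves it to the programmer to know which names share state.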

State

  • How can we program without mutable state?
  • How can languages make mutable state safe and easy to reason about?
  • How should languages handle resource allocation/deallocation?

Data Representation

data Exp = Num Int
         | Add Exp Exp
         | Sub Exp Exp
         | Mul Exp Exp
         | Div Exp Exp

eval :: (Num a, Integral a) => Exp -> a
eval e = case e of
    Num x   -> fromIntegral x
    Add a b -> eval a   +   eval b
    Sub a b -> eval a   -   eval b
    Mul a b -> eval a   *   eval b
    Div a b -> eval a `div` eval b
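
Since the problem sets implement interpreters in Python, here is a sketch of the same evaluator in that language. The class and field names are assumptions chosen to mirror the Haskell, and the function is called `eval_exp` to avoid shadowing Python's built-in `eval`. Python's `//` rounds toward negative infinity, as Haskell's `div` does.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class Add:
    left: "Exp"
    right: "Exp"

@dataclass(frozen=True)
class Sub:
    left: "Exp"
    right: "Exp"

@dataclass(frozen=True)
class Mul:
    left: "Exp"
    right: "Exp"

@dataclass(frozen=True)
class Div:
    left: "Exp"
    right: "Exp"

# An expression is one of the five node types above.
Exp = Union[Num, Add, Sub, Mul, Div]

def eval_exp(e: Exp) -> int:
    """Evaluate an expression tree by structural recursion."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Add):
        return eval_exp(e.left) + eval_exp(e.right)
    if isinstance(e, Sub):
        return eval_exp(e.left) - eval_exp(e.right)
    if isinstance(e, Mul):
        return eval_exp(e.left) * eval_exp(e.right)
    if isinstance(e, Div):
        return eval_exp(e.left) // eval_exp(e.right)
    raise TypeError(f"not an expression: {e!r}")
```

For example, `eval_exp(Add(Num(1), Mul(Num(2), Num(3))))` evaluates to `7`. The Python version needs five class definitions and a chain of type tests where Haskell needs one `data` declaration and one `case`, which is the kind of difference in data representation this part of the class looks at.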

Data Representation

  • How can we give programmers expressive, powerful tools to describe diverse data structures?
  • How can we make it easy to work with complex data structures?
  • How should we represent and process infinite structures?
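
One possible answer to the last question, sketched in Python: represent an infinite structure lazily, as a generator that produces elements only on demand. The function names `naturals` and `fizzbuzz` are hypothetical, chosen to echo the FizzBuzz programs shown earlier.

```python
from itertools import islice
from typing import Iterator

def naturals(start: int = 0) -> Iterator[int]:
    """The infinite sequence start, start + 1, start + 2, ..."""
    n = start
    while True:
        yield n
        n += 1

def fizzbuzz() -> Iterator[str]:
    """An infinite FizzBuzz stream built on top of naturals."""
    for i in naturals(1):
        if i % 15 == 0:
            yield "FizzBuzz"
        elif i % 3 == 0:
            yield "Fizz"
        elif i % 5 == 0:
            yield "Buzz"
        else:
            yield str(i)

# Only the five requested elements are ever computed.
first_five = list(islice(fizzbuzz(), 5))
```

Here `first_five` is `["1", "2", "Fizz", "4", "Buzz"]`. The consumer, not the producer, decides how much of the structure gets built.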