Created: 2023-05-11 Thu 11:06
<expr>  ::= <expr> "-" <expr> | <expr1>
<expr1> ::= <expr1> "*" <expr1> | <expr2>
<expr2> ::= <id> | "(" <expr> ")"
(x - y) * z
x - (y - z)
<expr>  ::= <expr> "-" <expr1> | <expr1>
<expr1> ::= <expr2> "*" <expr1> | <expr2>
<expr2> ::= <id> | "(" <expr> ")"
(x - y) * z
and x - (y - z)
Extend the expression grammar with exponentiation "^" and unary negation "~" operators.
<expr>  ::= <expr> "-" <expr1> | <expr1>
<expr1> ::= <expr2> "*" <expr1> | <expr2>
<expr2> ::= <id> | "(" <expr> ")"
a^b^c*d
should parse as (a^(b^c))*d
<expr>  ::= <expr> "-" <expr1> | <expr1>
<expr1> ::= <expr2> "*" <expr1> | <expr2>
<expr2> ::= <expr3> "^" <expr2> | <expr3>
<expr3> ::= "~" <expr4> | <expr4>
<expr4> ::= <id> | "(" <expr> ")"
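As a check on the extended grammar, here is a minimal recursive-descent sketch with one method per nonterminal. The `tokenize` helper, the `Unop` node, and the `show` printer are assumptions made for this illustration, not part of the notes. The left-recursive `<expr>` rule is written as a loop so that the parser terminates.

```python
import re
from dataclasses import dataclass


@dataclass
class Id:
    name: str


@dataclass
class Unop:
    op: str
    arg: object


@dataclass
class Binop:
    op: str
    lhs: object
    rhs: object


def tokenize(src):
    # Hypothetical lexer: identifiers plus the single-character symbols used here.
    return re.findall(r"[A-Za-z_]\w*|[-*^~()]", src)


class Parser:
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # <expr> ::= <expr> "-" <expr1> | <expr1>   (left-associative, so loop)
        lhs = self.expr1()
        while self.peek() == "-":
            self.advance()
            lhs = Binop("-", lhs, self.expr1())
        return lhs

    def expr1(self):
        # <expr1> ::= <expr2> "*" <expr1> | <expr2>   (right-recursive)
        lhs = self.expr2()
        if self.peek() == "*":
            self.advance()
            return Binop("*", lhs, self.expr1())
        return lhs

    def expr2(self):
        # <expr2> ::= <expr3> "^" <expr2> | <expr3>   (right-associative)
        lhs = self.expr3()
        if self.peek() == "^":
            self.advance()
            return Binop("^", lhs, self.expr2())
        return lhs

    def expr3(self):
        # <expr3> ::= "~" <expr4> | <expr4>
        if self.peek() == "~":
            self.advance()
            return Unop("~", self.expr4())
        return self.expr4()

    def expr4(self):
        # <expr4> ::= <id> | "(" <expr> ")"
        tok = self.advance()
        if tok == "(":
            inner = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok is None or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return Id(tok)


def show(e):
    # Render with explicit parentheses so the grouping is visible.
    if isinstance(e, Id):
        return e.name
    if isinstance(e, Unop):
        return f"({e.op}{show(e.arg)})"
    return f"({show(e.lhs)}{e.op}{show(e.rhs)})"


print(show(Parser("a^b^c*d").expr()))  # ((a^(b^c))*d)
```

Because "^" sits one level below "*" in the grammar, it binds tighter, and the right-recursive rule makes it group to the right.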
Many languages have an if-then-else construct with an optional else clause, e.g.:
if A then B else C
if A then B
How should we read the following code?
if A then if B then C else D
if A then (if B then C) else D
or
if A then (if B then C else D)
else is attached to the closest if, so the second reading applies.
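The closest-if rule falls out naturally from a greedy recursive-descent parser. The sketch below assumes a toy statement language in which conditions and plain statements are single tokens; `parse_stmt`, `If`, `Name`, and `show` are hypothetical names for this illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Name:
    name: str


@dataclass
class If:
    cond: str
    then: object
    orelse: Optional[object] = None


def parse_stmt(tokens, pos=0):
    """Parse one statement starting at tokens[pos]; return (node, next_pos)."""
    if pos < len(tokens) and tokens[pos] == "if":
        cond = tokens[pos + 1]
        if tokens[pos + 2] != "then":
            raise SyntaxError("expected 'then'")
        then, pos = parse_stmt(tokens, pos + 3)
        orelse = None
        # Greedy choice: the innermost call that is still running sees the
        # "else" first, so the else binds to the closest "if".
        if pos < len(tokens) and tokens[pos] == "else":
            orelse, pos = parse_stmt(tokens, pos + 1)
        return If(cond, then, orelse), pos
    return Name(tokens[pos]), pos + 1


def show(node):
    # Render with explicit parentheses around every if.
    if isinstance(node, Name):
        return node.name
    text = f"if {node.cond} then {show(node.then)}"
    if node.orelse is not None:
        text += f" else {show(node.orelse)}"
    return f"({text})"


tree, _ = parse_stmt("if A then if B then C else D".split())
print(show(tree))  # (if A then (if B then C else D))
```

The outer if ends up with no else branch, because the inner call consumes the else before the outer call gets a chance to look at it.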
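1 - 2
(int) -2
(x) -2
In a C-like language, is the last line a cast of -2 to type x, or the subtraction x - 2? The answer depends on whether x names a type.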
<expr>  ::= <expr> "-" <expr1> | <expr1>
<expr1> ::= <expr1> "*" <expr2> | <expr2>
<expr2> ::= <id> | "(" <expr> ")"
<expr> ::= <id> | <expr> ("-" | "*") <expr>
from dataclasses import dataclass

@dataclass
class Expr:
    pass

@dataclass
class Id(Expr):
    name: str

@dataclass
class Binop(Expr):
    op: str
    lhs: Expr
    rhs: Expr

ast = Binop('-', Binop('-', Id('x'), Id('y')), Binop('*', Id('z'), Id('z')))
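To see which expression an AST value denotes, it helps to print it back with every grouping made explicit. The `to_source` function below is a hypothetical helper added for illustration; the classes repeat the definitions above so that the snippet runs on its own.

```python
from dataclasses import dataclass


@dataclass
class Expr:
    pass


@dataclass
class Id(Expr):
    name: str


@dataclass
class Binop(Expr):
    op: str
    lhs: Expr
    rhs: Expr


def to_source(e: Expr) -> str:
    # Fully parenthesize every binary operation so that the tree shape,
    # not precedence rules, determines how the output reads.
    if isinstance(e, Id):
        return e.name
    if isinstance(e, Binop):
        return f"({to_source(e.lhs)} {e.op} {to_source(e.rhs)})"
    raise TypeError(f"not an expression: {e!r}")


ast = Binop('-', Binop('-', Id('x'), Id('y')), Binop('*', Id('z'), Id('z')))
print(to_source(ast))  # ((x - y) - (z * z))
```

The AST needs no node for parentheses: grouping is encoded in the nesting of the Binop values.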
class Parser:
    def expr(self):
        # Parse <expr> ::= <expr> "-" <expr1> | <expr1>
        pass

    def expr1(self):
        # Parse <expr1> ::= <expr2> "*" <expr1> | <expr2>
        pass

    def expr2(self):
        # Parse <expr2> ::= <id> | "(" <expr> ")"
        pass
class Parser:
    def expr2(self):
        # Parse <expr2> ::= <id> | "(" <expr> ")"
        if next_token_is_id():
            return Id(token)
        elif next_token_is_lparen():
            # Parse "(" <expr> ")"
            pass
        else:
            error()
class Parser:
    def expr2(self):
        # Parse <expr2> ::= <id> | "(" <expr> ")"
        if next_token_is_id():
            return Id(token)
        elif next_token_is_lparen():
            expr = self.expr()
            assert_next_token_is_rparen()
            return expr
        else:
            error()
class Parser:
    def expr1(self):
        # Parse <expr1> ::= <expr2> "*" <expr1> | <expr2>
        lhs = self.expr2()
        if next_token_is_star():
            return Binop("*", lhs, self.expr1())
        else:
            return lhs
<expr1>      ::= <expr2> <expr1-rest>
<expr1-rest> ::= "*" <expr1> | ε
class Parser:
    def expr(self):
        # Parse <expr> ::= <expr> "-" <expr1> | <expr1>
        lhs = self.expr()
        # ???
Direct:
<expr> ::= <expr> "-" <expr1> | <expr1>
Indirect:
<expr1> ::= <expr2> "*" <expr1> | <expr2>
<expr2> ::= <id> | <expr1>
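Both forms break a naive recursive-descent parser for the same reason: the parser calls itself again before consuming any input. The sketch below shows the direct case; `NaiveParser` and `blows_up` are hypothetical names used only for this demonstration.

```python
class NaiveParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expr(self):
        # <expr> ::= <expr> "-" <expr1> | <expr1>
        # The first thing the rule asks for is another <expr>, so we recurse
        # without having looked at a single token.
        lhs = self.expr()
        return lhs


def blows_up(tokens):
    # Report whether parsing runs out of stack instead of returning.
    try:
        NaiveParser(tokens).expr()
    except RecursionError:
        return True
    return False


print(blows_up(["x", "-", "y"]))  # True
```

The input never matters here: `self.pos` is still 0 when the recursion limit is reached.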
Convert to iteration:
<expr> ::= <expr1> ("-" <expr1>)*
class Parser:
    def expr(self):
        # Parse <expr> ::= <expr1> ("-" <expr1>)*
        exprs = [self.expr1()]
        while next_token_is_dash():
            exprs.append(self.expr1())
        pass  # Process expressions
[Id("a")]                     -> Id("a")
[Id("a"), Id("b")]            -> Binop("-", Id("a"), Id("b"))
[Id("a"), Id("b"), Id("c")]   -> Binop("-", Binop("-", Id("a"), Id("b")), Id("c"))
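The mapping above is a left fold over the list of operands. One way to write the "process expressions" step, assuming the list is never empty, is with `functools.reduce`; the helper name `fold_left` is an assumption made for this sketch.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Id:
    name: str


@dataclass
class Binop:
    op: str
    lhs: object
    rhs: object


def fold_left(op, exprs):
    # reduce combines from the left: ((e0 op e1) op e2) op ...
    # With a single element it returns that element unchanged.
    return reduce(lambda lhs, rhs: Binop(op, lhs, rhs), exprs)


print(fold_left("-", [Id("a")]))
print(fold_left("-", [Id("a"), Id("b"), Id("c")]))
```

Folding from the left is what restores the left associativity that the original left-recursive rule expressed.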
What if you want users to be able to extend your language syntax?
From least to most powerful:
let (+?) x y = ... in
x +? y
Inherits precedence/associativity from +
(+?) x y = ...
infixl 5 +?
Precedence/associativity specified directly
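When precedence and associativity are declared by the user, the parser can no longer hard-code one grammar rule per level. A common alternative is precedence climbing driven by a table. The sketch below is an illustration in Python, not how any particular compiler does it; `declare`, the `fixity` table, and the default levels for "-" and "*" are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class Id:
    name: str


@dataclass
class Binop:
    op: str
    lhs: object
    rhs: object


class Parser:
    def __init__(self):
        # op -> (precedence, associativity); a higher number binds tighter.
        self.fixity = {"-": (6, "left"), "*": (7, "left")}

    def declare(self, op, prec, assoc):
        # Hypothetical analogue of a declaration such as "infixl 5 +?".
        self.fixity[op] = (prec, assoc)

    def parse(self, src):
        # Tokens: names, runs of operator symbols, and single parentheses.
        self.tokens = re.findall(r"\w+|[^\w\s()]+|[()]", src)
        self.pos = 0
        tree = self.expr(0)
        if self.pos != len(self.tokens):
            raise SyntaxError(f"unexpected token {self.tokens[self.pos]!r}")
        return tree

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self, min_prec):
        lhs = self.atom()
        while True:
            op = self.peek()
            if op not in self.fixity:
                break
            prec, assoc = self.fixity[op]
            if prec < min_prec:
                break
            self.pos += 1
            # A left-associative operator must not reappear at the same level
            # on its right-hand side; a right-associative one may.
            rhs = self.expr(prec + 1 if assoc == "left" else prec)
            lhs = Binop(op, lhs, rhs)
        return lhs

    def atom(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        if tok == "(":
            inner = self.expr(0)
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.pos += 1
            return inner
        if not re.fullmatch(r"\w+", tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        return Id(tok)


p = Parser()
p.declare("+?", 5, "left")
print(p.parse("x +? y - z"))
print(p.parse("x +? y +? z"))
```

With "+?" declared at level 5, below "-" at level 6, the first input groups as x +? (y - z), and the second groups to the left as (x +? y) +? z.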
if_then_else_ : {A : Set} → Bool → A → A → A
if true then x else y = x
if false then x else y = y
@Override
void myMethod() { ... }
Annotations allow users to attach unstructured data to the AST
@ToString(includeFieldNames=true)
public static class Square {
    private final int width, height;
}
public static class Square {
    private final int width, height;

    @Override
    public String toString() {
        return "Square(width=" + this.width + ", height=" + this.height + ")";
    }
}
//go:generate stringer -type=Pill
package painkiller
type Pill int
const (
	Placebo Pill = iota
	Aspirin
	Ibuprofen
	Paracetamol
	Acetaminophen = Paracetamol
)
#define INC(x) { int a = 0; x++; }
int a = 1; INC(a);
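The C preprocessor substitutes text and knows nothing about scopes, so the macro's local variable can capture the caller's. The sketch below imitates that substitution in Python to make the expansion visible; `expand` is a hypothetical stand-in for the preprocessor and handles only a single-parameter macro.

```python
import re


def expand(body: str, param: str, arg: str) -> str:
    # Naive, unhygienic expansion: replace each whole-word occurrence of the
    # parameter with the argument text, ignoring scopes entirely.
    return re.sub(rf"\b{re.escape(param)}\b", lambda m: arg, body)


INC_BODY = "{ int a = 0; x++; }"

print(expand(INC_BODY, "x", "b"))  # { int a = 0; b++; }
print(expand(INC_BODY, "x", "a"))  # { int a = 0; a++; }
```

In the second expansion the a++ refers to the a declared inside the block, so the caller's a is never incremented. A hygienic macro system would rename the inner variable to avoid this capture.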