Procedural Macros for Java
Jonathan Bachrach
MIT AI Lab
Keith Playford
Functional Objects, Inc.
Say What?
Lisp-style macro power and simplicity for Java
Debt to Dylan and Lisp is great
Seamlessly procedural
WYSIWYG
Mostly hygienic
Source level debuggable
New material; feedback is appreciated
Procedural Macro Motivation
Analysis and rewriting no longer constrained
Simplified pattern matching and rewrite rule engine
Can package and re-use syntax expansion utilities
Pattern matching engine is extensible
Overview
Definitions
Parsing
Execution Model
Fragments
CodeQuotes
Pattern Matching
Hygiene
Nested CodeQuotes
Debugging
Comparisons
Implementation
Future Work
Macro
Syntactic extension to core language
Defines meaning of one construct in terms of other constructs
Their declarations are called macro definitions
Their uses are called macro calls
Macro Motivation
“From now on, a main goal in designing a language should be to plan for growth.”
Guy Steele, “Growing a Language”, OOPSLA-98, invited talk
Power of abstraction where functional abstraction won’t suffice, affording:
Clarity
Concision
Implementation hiding
People are lazy
do the right thing
forEach Example
TestSuite Macros
Declarative Data Examples
Compilation
Parse into skeletal syntax tree
Recursively expand macros top-down
Create IR
Optimize IR
Emit code
Macro Expansion
Replaces macro call with another construct, which itself can contain macro calls.
This process repeats until there are no macro calls remaining in the program
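The repeat-until-done behaviour can be sketched in plain Java. This is a toy model under stated assumptions, not the preprocessor's code: source is a flat list of token strings, a "macro" is a hypothetical table entry mapping one name token to replacement tokens, and the class name ExpandToFixedPoint is invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: a "macro" maps one name token to a list of replacement tokens.
// Real macros take arguments; this only illustrates expanding to a fixed point.
class ExpandToFixedPoint {
    static List<String> expand(List<String> tokens, Map<String, List<String>> macros) {
        List<String> current = new ArrayList<>(tokens);
        boolean changed = true;
        while (changed) {                      // repeat until no macro calls remain
            changed = false;
            List<String> next = new ArrayList<>();
            for (String token : current) {
                List<String> replacement = macros.get(token);
                if (replacement != null) {     // macro call: splice in its expansion
                    next.addAll(replacement);
                    changed = true;
                } else {
                    next.add(token);
                }
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, List<String>> macros = new HashMap<>();
        macros.put("unless", Arrays.asList("if", "not"));  // expansion contains a macro call
        macros.put("not", Arrays.asList("!"));
        System.out.println(expand(Arrays.asList("unless", "x"), macros));
    }
}
```

A macro whose expansion mentions itself would make this loop run forever; the sketch does not guard against that.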
Macro Shapes
Statement:
try, while, …
Call:
assert(?:expression, ?message:expression)
Special:
methods and fields
Java Syntax
Java forms fit the pattern
… clause clause clause etc
where clauses are:
thing …;
thing … { }
Also expressions
Parsing Macros
Macros can occur at certain known positions
Declarations
Statements
Expressions
While parsing in a macro position look for macro name
Load SyntaxExpander for information
Find macro extent
Find start and end points
Tokens in between serve as macro call arguments
Parsing Function Call Macros
Start is function name
End is last matching argument parenthesis
Example
#{ f(assert(x < 0, "bad " + x), g(y)) }
assert's macro arguments would be:
#{ assert(x < 0, "bad " + x) }
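Finding the extent of a call macro amounts to matching parentheses from the macro name onward. A minimal sketch, assuming the input has already been split into token strings (the class and method names are hypothetical, not part of the preprocessor):

```java
import java.util.Arrays;
import java.util.List;

class CallMacroExtent {
    // Returns the index of the ")" that closes the "(" following tokens[start],
    // where tokens[start] is the macro name.
    static int findEnd(List<String> tokens, int start) {
        if (start + 1 >= tokens.size() || !tokens.get(start + 1).equals("(")) {
            throw new IllegalArgumentException("macro name must be followed by (");
        }
        int depth = 0;
        for (int i = start + 1; i < tokens.size(); i++) {
            String t = tokens.get(i);
            if (t.equals("(")) {
                depth++;
            } else if (t.equals(")") && --depth == 0) {
                return i;                      // last matching argument parenthesis
            }
        }
        throw new IllegalArgumentException("unbalanced parentheses");
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "f", "(", "assert", "(", "x", "<", "0", ",", "\"bad \"", "+", "x", ")",
            ",", "g", "(", "y", ")", ")");
        int start = tokens.indexOf("assert");
        int end = findEnd(tokens, start);
        System.out.println(String.join(" ", tokens.subList(start, end + 1)));
    }
}
```

The tokens from the name through the returned index are what the example above shows as assert's macro arguments.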
Parsing Statement Macros
Start is first token past last terminator
End is first terminator not followed by a continuation word
Example
#{ f(x); do x++; while (x < 0); g(y); }
do’s macro arguments would be:
#{ do x++; while (x < 0); }
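The end-of-statement rule can be sketched the same way. This sketch assumes pre-split tokens, treats only ";" as a terminator, and ignores terminators nested inside braces or parentheses, which a real parser would have to skip; the names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StatementMacroExtent {
    // Returns the index of the first ";" at or after start that is not
    // followed by one of the macro's continuation words.
    static int findEnd(List<String> tokens, int start, Set<String> continuationWords) {
        for (int i = start; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(";")) {
                continue;
            }
            boolean isLastToken = i + 1 == tokens.size();
            if (isLastToken || !continuationWords.contains(tokens.get(i + 1))) {
                return i;
            }
        }
        throw new IllegalArgumentException("no terminator found");
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "f", "(", "x", ")", ";", "do", "x", "++", ";",
            "while", "(", "x", "<", "0", ")", ";", "g", "(", "y", ")", ";");
        Set<String> continuation = new HashSet<>(Arrays.asList("while"));
        int start = tokens.indexOf("do");   // first token past the previous terminator
        int end = findEnd(tokens, start, continuation);
        System.out.println(String.join(" ", tokens.subList(start, end + 1)));
    }
}
```

Here "while" is registered as a continuation word of do, so the ";" after x++ does not end the macro call.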
Execution Model
Macro Expanders are implemented as classes whose names are the name of the macro with a “SyntaxExpander” suffix
forEach’s macro class would be named forEachSyntaxExpander
Coexist with runtime Java source and class files
Dynamically loaded into compiler on demand during build
Macros are looked up using the usual Java class lookup strategy:
Package scoped
Class scoped
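The naming convention makes macro lookup an ordinary class lookup. A minimal sketch, assuming Class.forName is an acceptable stand-in for whatever loader the preprocessor actually uses (the slides do not say), with hypothetical helper names:

```java
class ExpanderLookup {
    static final String SUFFIX = "SyntaxExpander";

    // Builds the expander class name for a macro, e.g. forEach -> forEachSyntaxExpander.
    static String expanderClassName(String packageName, String macroName) {
        String simpleName = macroName + SUFFIX;
        return packageName.isEmpty() ? simpleName : packageName + "." + simpleName;
    }

    // Returns the expander class, or null when no class of that name can be found,
    // in which case the name is not a macro in that package.
    static Class<?> load(String packageName, String macroName) {
        try {
            return Class.forName(expanderClassName(packageName, macroName));
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(expanderClassName("", "forEach"));
        System.out.println(load("no.such.pkg", "forEach"));
    }
}
```

Only package-scoped lookup is shown; class-scoped macros would need the enclosing class name in the lookup as well.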
Macro Class
Contains
Access specifier
Name
Shape
Continuation words
Expand method
Extra class declarations
Fragments
Fragments library provides a collection of classes suitable for representing fragments of source code in skeleton syntax tree form
Skeleton syntax trees may be constructed and pulled apart manually using the exported interface of these classes
Source level tools are provided for easier parsing and construction
IO facility permits outputting a re-readable text form and constructing trees from a text form
Fragment Classes
Skeleton Syntax Tree Example
f(x, y) + g[0] ;
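One way to picture a skeleton syntax tree for this statement is as tokens grouped by bracket pairs, with no further grammar applied. The sketch below is an assumption for illustration only: it uses nested java.util lists rather than the library's Fragment classes, whose interface is not shown in these slides, and it drops the bracket characters themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkeletonSketch {
    // Groups a flat token list into nested lists, one nested list per bracket pair.
    // The bracket kind is discarded here; a real tree would record it.
    static List<Object> nest(List<String> tokens) {
        int[] position = {0};
        return parse(tokens, position, null);
    }

    private static List<Object> parse(List<String> tokens, int[] position, String closer) {
        List<Object> out = new ArrayList<>();
        while (position[0] < tokens.size()) {
            String token = tokens.get(position[0]++);
            if (token.equals(closer)) {
                return out;                         // end of the current group
            }
            String innerCloser = closerFor(token);
            if (innerCloser != null) {
                out.add(parse(tokens, position, innerCloser));
            } else {
                out.add(token);
            }
        }
        if (closer != null) {
            throw new IllegalArgumentException("missing " + closer);
        }
        return out;
    }

    private static String closerFor(String token) {
        switch (token) {
            case "(": return ")";
            case "[": return "]";
            case "{": return "}";
            default: return null;
        }
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "f", "(", "x", ",", "y", ")", "+", "g", "[", "0", "]", ";");
        System.out.println(nest(tokens));
    }
}
```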
CodeQuote
Like Lisp’s quasiquote (QQ)
WYSIWYG / concrete representation of code within #{}’s
#{ if (isOn()) turnOff(); }
Evaluation of codeQuote yields skeleton form representation of code
Quoting is turned off with ? (like QQ’s comma)
Variables: ?x
Expressions: ?(f(x))
CodeQuote Example One
Fragment test = #{ isOn() };
Fragment then = #{ turnOff(); };
return #{ if (?test) ?then };
=>
#{ if (isOn()) turnOff(); }
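For readers without the preprocessor, the substitution that ? performs can be approximated in plain Java. This is a sketch only: fragments are modelled as lists of token strings, and TemplateFill is a hypothetical helper, not how codeQuotes are implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TemplateFill {
    // Copies the template, replacing each "?name" token with the tokens bound to name.
    static List<String> fill(List<String> template, Map<String, List<String>> bindings) {
        List<String> out = new ArrayList<>();
        for (String token : template) {
            if (token.startsWith("?")) {
                List<String> value = bindings.get(token.substring(1));
                if (value == null) {
                    throw new IllegalArgumentException("unbound variable " + token);
                }
                out.addAll(value);               // splice the fragment in place
            } else {
                out.add(token);                  // quoted code is copied as written
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> bindings = new HashMap<>();
        bindings.put("test", Arrays.asList("isOn", "(", ")"));
        bindings.put("then", Arrays.asList("turnOff", "(", ")", ";"));
        List<String> template = Arrays.asList("if", "(", "?test", ")", "?then");
        System.out.println(String.join(" ", fill(template, bindings)));
    }
}
```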
CodeQuote Example Two
Pattern Matching
syntaxSwitch: like switch statement
syntaxSwitch (?:expression) { rules:* }
Rule:
case ?pattern:codeQuote : ?:body
Pattern:
CodeQuote that looks like the construct to be matched
Augmented by pattern variables which match and lexically bind to appropriate parts of the construct
syntaxSwitch Example
syntaxSwitch Evaluation
syntaxSwitch (?:expression) { rules:* }
Expression must evaluate to a valid Fragment
Expression is tested against each rule’s pattern in turn
If one of the patterns matches, its pattern variables are bound to local variables of the same name and the corresponding right hand side body is run in the context of those bindings
If no patterns are found to match, then an exception is thrown
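The evaluation order described above can be sketched in plain Java. Everything here is an assumption for illustration: patterns and input are lists of token strings, each "?name" matches exactly one token, and the class names are invented. Only the control flow (try each rule in turn, run the first matching body, throw if none match) follows the slide.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class SyntaxSwitchSketch {
    static class Rule {
        final List<String> pattern;
        final Function<Map<String, String>, String> body;

        Rule(List<String> pattern, Function<Map<String, String>, String> body) {
            this.pattern = pattern;
            this.body = body;
        }
    }

    // One-token-per-variable matcher: returns the bindings, or null on mismatch.
    static Map<String, String> match(List<String> pattern, List<String> input) {
        if (pattern.size() != input.size()) {
            return null;
        }
        Map<String, String> bindings = new HashMap<>();
        for (int i = 0; i < pattern.size(); i++) {
            String p = pattern.get(i);
            if (p.startsWith("?")) {
                bindings.put(p.substring(1), input.get(i));
            } else if (!p.equals(input.get(i))) {
                return null;
            }
        }
        return bindings;
    }

    static String run(List<String> input, List<Rule> rules) {
        for (Rule rule : rules) {                      // rules are tried in order
            Map<String, String> bindings = match(rule.pattern, input);
            if (bindings != null) {
                return rule.body.apply(bindings);      // body runs with the bindings
            }
        }
        throw new IllegalStateException("no pattern matched " + input);
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
            new Rule(Arrays.asList("inc", "?v"), b -> b.get("v") + "++"),
            new Rule(Arrays.asList("?op", "?v"), b -> "unknown op " + b.get("op")));
        System.out.println(run(Arrays.asList("inc", "x"), rules));
    }
}
```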
Pattern Variables
Denoted with ? prefixing their names
Have constraints that restrict the syntactic type of fragments that they match
Examples: name, expression, body, …
Constraint denoted with a colon separated suffix (e.g., ?class:name)
Variable name defaults to constraint name (e.g., ?:type is the same as ?type:type)
Wildcard constraint (*) matches anything
Ellipsis (…) is an abbreviation for wildcard
Pattern Matching Execution
Left to right processing
Shortest first priority for wildcard variables
Largest first priority for non-wildcard variables
A pattern matches if and only if all of its subpatterns match
Simple folding rules are applied to commas to make matching lists easier
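Left-to-right processing with shortest-first wildcards can be sketched as a small backtracking matcher. This covers only literal tokens and wildcard variables written "?name:*"; constrained variables, largest-first matching, and comma folding are left out, and the class is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WildcardMatch {
    // Returns the wildcard bindings, or null if the pattern does not match.
    static Map<String, List<String>> match(List<String> pattern, List<String> input) {
        Map<String, List<String>> bindings = new HashMap<>();
        return matchFrom(pattern, 0, input, 0, bindings) ? bindings : null;
    }

    private static boolean matchFrom(List<String> pattern, int p, List<String> input, int i,
                                     Map<String, List<String>> bindings) {
        if (p == pattern.size()) {
            return i == input.size();          // all subpatterns matched, input consumed
        }
        String pat = pattern.get(p);
        if (pat.startsWith("?") && pat.endsWith(":*")) {
            String name = pat.substring(1, pat.length() - 2);
            for (int end = i; end <= input.size(); end++) {   // shortest match first
                bindings.put(name, new ArrayList<>(input.subList(i, end)));
                if (matchFrom(pattern, p + 1, input, end, bindings)) {
                    return true;
                }
            }
            bindings.remove(name);
            return false;
        }
        return i < input.size() && pat.equals(input.get(i))
                && matchFrom(pattern, p + 1, input, i + 1, bindings);
    }

    public static void main(String[] args) {
        List<String> pattern = Arrays.asList("?first:*", ",", "?rest:*");
        List<String> input = Arrays.asList("x", ",", "y", ",", "z");
        System.out.println(match(pattern, input));
    }
}
```

Because the first wildcard takes the shortest span that lets the rest succeed, it stops at the first comma and the second wildcard receives everything after it.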
forEach Macro
syntax Macro
Hygiene and Referential Transparency
Variable references copied from a macro call and definition mean the same thing in an expansion
Avoids the need for gensym and for manually exporting names used in macro definitions
Hygiene Design
Each template name records its original name, lexical context, and specific macro call context
A named value reference and a binding connect if and only if the original name and the specific macro call occurrences are both the same
Hygiene context is dynamically bound during expansion
Hygiene contexts can also be manually established and dynamically bound
References to global bindings should mean the same thing as they did in the macro definition
Hard to do in Java without violating security
Forces user to manually export macro uses
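The connection rule for named value references and bindings can be sketched as an equality test on two recorded fields. This is an illustrative model, not the (unimplemented) hygiene machinery: the integer macro call id is an assumption, and the recorded lexical context is omitted.

```java
class HygienicName {
    final String originalName;
    final int macroCallId;   // hypothetical id; 0 stands for code written outside any macro

    HygienicName(String originalName, int macroCallId) {
        this.originalName = originalName;
        this.macroCallId = macroCallId;
    }

    // A reference connects to a binding only if both the original name
    // and the specific macro call occurrence are the same.
    boolean connectsTo(HygienicName binding) {
        return originalName.equals(binding.originalName)
                && macroCallId == binding.macroCallId;
    }

    public static void main(String[] args) {
        HygienicName userI = new HygienicName("i", 0);       // written by the macro's caller
        HygienicName templateI = new HygienicName("i", 1);   // introduced by macro call 1
        System.out.println(userI.connectsTo(templateI));
        System.out.println(templateI.connectsTo(new HygienicName("i", 1)));
    }
}
```

Under this rule a variable named i introduced by a template cannot capture the caller's i, which is why gensym is not needed.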
Parallel iteration
accessible Macro
User Defined Constraints
Based on class
Whose name is constraintName + “SyntaxConstraint”
Loaded on demand using standard Java class loading mechanism
That implements the SyntaxConstraint protocol
String getName()
boolean isAdmissable(SequenceFragment frags)
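A user-defined constraint might look like the sketch below. The two method signatures are taken from the slide; everything else is assumed. In particular SequenceFragment here is a stand-in holding token strings, since the real class belongs to the fragments library, and the "identifier" constraint is a made-up example.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for the library's SequenceFragment (assumption: a list of token strings).
class SequenceFragment {
    final List<String> tokens;

    SequenceFragment(List<String> tokens) {
        this.tokens = tokens;
    }
}

// The protocol as listed on the slide (method spelling kept as given).
interface SyntaxConstraint {
    String getName();
    boolean isAdmissable(SequenceFragment frags);
}

// Named constraintName + "SyntaxConstraint" so it can be found by class lookup.
class identifierSyntaxConstraint implements SyntaxConstraint {
    public String getName() {
        return "identifier";
    }

    // Admits exactly one token that is spelled like a Java identifier.
    // Keywords are not rejected in this sketch.
    public boolean isAdmissable(SequenceFragment frags) {
        if (frags.tokens.size() != 1) {
            return false;
        }
        String token = frags.tokens.get(0);
        if (token.isEmpty() || !Character.isJavaIdentifierStart(token.charAt(0))) {
            return false;
        }
        for (int i = 1; i < token.length(); i++) {
            if (!Character.isJavaIdentifierPart(token.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SyntaxConstraint constraint = new identifierSyntaxConstraint();
        System.out.println(constraint.isAdmissable(new SequenceFragment(Arrays.asList("count"))));
    }
}
```

With such a class on the class path, a pattern variable like ?x:identifier would be restricted to fragments the constraint admits.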
Nested CodeQuotes
Introduce nested pattern variables and expressions:
??x, ??(f(x)), ???y
Evaluate when the var/expr's nesting level equals the codeQuote nesting level; otherwise regenerate:
#{ #{ ?x } }                       => #{ ?x }
Fragment x = #{ a }; #{ #{ ??x } } => #{ a }
Can keep ?’s using !
Fragment x = #{ y }; Fragment y = #{ a }; #{ #{ ?!?x } }
=> #{ ?y }
Macro Defining Macros
Self Generating Code Quote
Self Generating Java Program
Tracing
Either globally or locally
Print when pattern variables match and what they match
Print when patterns do or do not match
Debugging
Maintain source locations
If integrated into compiler can also maintain macro expansion context to support error trailing through macro expansion
Comparisons
Dylan
Lisp
R5RS
Grammar Extensions
See Brabrand and Schwartzbach for CPP, M4, TeX, C++ templates
Dylan Macros
More complicated pattern matching
More limited shapes
Not procedural (although Bachrach and Playford have proposed a procedural extension)
Lisp Macros
Quasiquote is a bit more complicated (but potentially more powerful):
No quote
No unquote-splicing
,’,x versus ??x and ,,x versus ?!?x
Variable capture is a problem
Macro calls are difficult to debug
R5RS Macros
syntax-rules is not procedural
Two environments
… is cute but brittle
Pattern variables are defaults
No hygiene escapes
Local syntax
Other Scheme macro systems exist
Grammar Extension Macros
“Programmable Syntax Macros”
by Weise and Crew and
“Growing Languages with Metamorphic Syntax Macros”
by Brabrand and Schwartzbach
Harder to understand rules
Introduce global reserved words
More awkward to write complicated macros
Type checkable
Implementation
Preprocessor
Takes .jpm files
Produces .java files preserving line numbers
Optionally calls Java Compiler
Uses standard ANTLR lexer and parser
Tracing, Error Trailing and Hygiene not implemented
Available for academic use soon
www.ai.mit.edu/~jrb/jpm
Future Work
Symbol Macros
Generalized Variables (e.g., setf)
Revisit nested CodeQuotes
Restricted macro call contexts
Type checking
Integration with compiler (e.g., Kopi)
Performance
Ambiguous use of ? ?
Other languages (e.g., Scheme, C, …)
Credits
Dave Moon -- Dr. Dylan Macros
Alan Bawden -- Dr. Quasiquote
Thomas Mack -- UROP
Howie & Bob -- Drs. Funding
Benefitted from discussions with
Greg Sullivan
Scott McKay
Andrew Blumberg