[Prev][Next][Index][Thread]
Wildcard patterns (was: Re: Macro loops infinitely --- GD bug?)
Keith,
thanks for your valuable comments.
After reading what you wrote, I re-parsed the DRM Macros chapter, and now I
see a much clearer picture. But this also gives a blow to my perception as a
macro writer that I can have a BNF-like parsing on fragments by
auxiliary-rule-sets. So ?pat is extended to ?pat:* if the macro definition
contains an aux rule set #"pat". Also no two wildcarded pattern variables
may occur in a pattern-sequence [1].
I hoped to be able to write a BNF-like 1+ repetition as in
once:
{ ?inner } => { }
{ ?inner; ... } => { ?inner, ... }
but the implicit wildcard constraint in the first rule makes this
impossible, because the match is done and when while expansion the ?inner
pattern is consulted and it does not match it is too late for backtracking.
For my above needs an extra check is needed on the structure of ?inner,
which brought me the idea of a "positive" constraint. Consider this
auxiliary rule set in an "extended" Dylan:
once:
{ ?inner:+; ... } => { ?inner, ... }
The positive constraint would only be available if there was an aux rule set
associated with the pattern variable name, and would fail if the captured
fragment does not have the required inner structure. So basically it would
be behaviorally the same as a wildcard constraint augmented with an extra
check. It would still backtrack trying more and more of the input fragment.
[1] would still be a restriction with positive constraints.
Another variant is also conceivable, a checked constraint that matches the
smallest portion of input that satisfies the required inner structure. Ths
could be an "equal" constraint "?inner:=". Because this constraint does not
invoke backtracking [1] does not apply.
In summary these two extensions could allow a BNF-like parsing discipline.
In my opinion it would be worth it, because domain specific sub-languages
could be better embedded in Dylan. Neel's lazy evaluation combinators come
to my mind for example.
I am a GD implementor too, so I have studied the compiler sources the last
days extensively. I have almost implemented the proposed extensions, so I
would say it is doable.
I am interested in comments from the Dylan community whether something like
the above is worth pursuing.
Thanks for listening.
Gabor
----------
>From: "Keith Playford" <keith.playford@pobox.com>
>To: "Gabor Greif" <ggreif@lucent.com>, <info-dylan@ai.mit.edu>
>Subject: RE: Macro loops infinitely --- GD bug?
>Date: Sam, 6. Jan 2001 2:03 Uhr
>
> To cut to the chase, the easiest way to express this is probably:
>
> define macro one-definer
> { define one ?:name ?once end }
> => { define constant ?name = list(?once) }
>
> once:
> { } => { }
> { ?inner:*; ... } => { ?inner, ... }
>
> inner:
> { } => { }
> { ?:expression, ... } => { ?expression, ... }
> end macro;
>
> There are a couple of reasons for the infinite recursions in the
> other formulations. First, selection of a matching pattern within
> a rule set is performed and committed to without consulting any
> aux-rules referred to[*]. For example, if a macro contains:
>
> rule:
> { ?subrule } => { something }
> { ?any:* } => { something else }
>
> subrule:
> { ?:name } => { ?name }
>
> the ?any:* pattern will never be reached. That's because:
>
> { ?subrule } => { something }
>
> is actually just shorthand for:
>
> { ?subrule:* } => { something }
>
> i.e. an unconstrained wildcard match followed by an aux-rule
> transformation.
>
> The safest way to think of aux-rules is as transformers applied
> after pattern matching and before substitution. A match failure
> in an aux-rule transformation won't cause backtracking within
> the invoking rule set because at that point pattern matching is
> a done deal in that rule set.
>
> The shorthand that allows the constraint to be elided on a pattern
> variable that names an aux-rule set has always been confusing and
> I recommend that people don't use it. Always declare the wildcard
> constraint explicitly.
>
> The other thing that often comes as a surprise is just how many
> cases a pattern like this is happy to match:
>
> { ?stuff:*; ... } => { something }
>
> Between wildcarding and the implicit separator folding rules,
> all of these inputs match (to at least one level):
>
> // the empty fragment
> 1 // an input with no separator at all
> 1; // an input with a terminator
> 1; 2 // an input with a separator
>
> That's why it's important to put unambiguous base cases (like the
> empty patterns in the example) first within a rule set.
>
> -- Keith
>
> [*] Technically there's one exception to this to do with
> "intermediate word" handling, but it's not something macro
> writers have to worry about.
>