Matching weird patterns with macros
- To: address@hidden
- Subject: Matching weird patterns with macros
- From: address@hidden (Douglas M. Auclair)
- Date: Sat, 20 Apr 2002 09:45:17 -0400
- Organization: http://groups.google.com/
- Sender: "Gregory T. Sullivan" <address@hidden>
- Xref: traf.lcs.mit.edu comp.lang.dylan:14126
I have a couple of questions about matching patterns with macros. I
import entity definitions directly into the xml-parser as compiled
code, so this:
...
<!-- Special characters for HTML -->
<!ENTITY lt CDATA "&#60;" -- less-than sign -->
<!ENTITY gt CDATA "&#62;" -- greater-than sign -->
...
becomes this:
...
*entities*[#"lt"] := make(<character-reference>, name: "#60", char: '<');
*entities*[#"gt"] := make(<character-reference>, name: "#62", char: '>');
...
My questions concern some limitations of the above. Is there a way
to match "--"? Also, d2c pukes unless I manually insert a separator
(e.g. a semicolon) between entities, so the original list becomes:
...
<!ENTITY lt CDATA "&#60;" less-than sign >;
<!ENTITY gt CDATA "&#62;" greater-than sign >;
...
(N.B. the comment tags are gone too, because the compiler complains
about them.)
Here's the macro I use to compile the entity declarations (which lives
in gd/examples/xml-parser/latin1-entities.dylan):
// --- begin-code ---
define macro entities-definer
  { define entities ?:name ?entities:* end }
    => { define function initialize-latin1-entities() => ()
           ?name := make(<table>);
           do(method(x) ?name[x.head] := x.tail end, list(?entities));
         end function initialize-latin1-entities }
  entities:
    { } => { }
    { ?data; ... } => { ?data, ... }
  data:
    { <!entity ?ent:name cdata ?charref:expression ?comment-body ?:token }
      => { list(?#"ent", make(<char-reference>, char: ?charref.as-char,
                         name: copy-sequence(?charref, start: 1,
                                             end: ?charref.size - 1))) }
  comment-body:
    { } => { }
    { ?:name ... } => { ?name, ... }
end macro entities-definer;
// --- end-code ---
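For concreteness, a call site for this macro might look roughly like the sketch below. It is hypothetical and not copied from the xml-parser sources: it assumes *entities* is already defined as a module variable (the expansion assigns to it rather than defining it), and it writes the character references out as "&#60;" and "&#62;", which is the form the copy-sequence call strips down to "#60" and "#62".

```dylan
// Hypothetical call site; assumes *entities* is defined elsewhere.
// Each declaration ends with ">" (matched by ?:token) and a ";" separator.
define entities *entities*
  <!ENTITY lt CDATA "&#60;" less-than sign >;
  <!ENTITY gt CDATA "&#62;" greater-than sign >;
end;

// The macro only defines the initializer; the table is built and
// filled when this function is called.
initialize-latin1-entities();
```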
So,
1) is there a way to match weird patterns, such as "--" and "<!" (I'm
not clear if the DRM forbids this), and
2) is there a way to avoid the need to manually insert a separator?
For 2), I know that a space can act as a separator for smaller
fragments:
define macro qw // a la perl
  { qw(?words) } => { ?words }
  words:
    { } => { }
    { ?:name ... } => { ?"name" " " ... }
end macro qw;

format-out("%s!\n", qw(Hello world from Doug Auclair));
This works fine for these small fragments. Can it work for entity
declarations, too?
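To spell out what the qw call turns into, here is a hand-written expansion (a sketch, not compiler output). It relies on adjacent string literals being read as a single string, which is my reading of the DRM grammar:

```dylan
// Hand expansion of qw(Hello world from Doug Auclair).
// Every word becomes a string literal followed by " ", so the
// last word picks up a trailing space as well.
format-out("%s!\n", "Hello" " " "world" " " "from" " " "Doug" " " "Auclair" " ");
```

No separator token is needed here because each word is a single name token; an entity declaration spans several tokens, which is where the question of separators comes in.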
Sincerely,
Doug Auclair