
Matching weird patterns with macros



I have a couple of questions about matching patterns with macros. I
import entity definitions directly into the xml-parser as compiled
code, so this:

...
<!-- Special characters for HTML -->
<!ENTITY lt CDATA "&#60;" -- less-than sign -->
<!ENTITY gt CDATA "&#62;" -- greater-than sign -->
...

becomes this:

...
*entities*[#"lt"] := make(<character-reference>, name: "#60", char: '<');
*entities*[#"gt"] := make(<character-reference>, name: "#62", char: '>');
...

My questions concern some limitations of the above. First, is there a
way to match "--"? Second, d2c pukes unless I manually insert a
separator (e.g. a semicolon) after each entity declaration, so the
original list becomes:

...
<!ENTITY lt CDATA "&#60;"  less-than sign >;
<!ENTITY gt CDATA "&#62;"  greater-than sign >;
...

(N.B. the "--" comment delimiters are gone, because the compiler
complains about them.)

Here's the macro I use to compile the entity declarations (which lives
in gd/examples/xml-parser/latin1-entities.dylan):

// --- begin-code ---
define macro entities-definer
{ define entities ?:name ?entities:* end } =>
 { define function initialize-latin1-entities() => ()
     ?name := make(<table>);
     do(method(x) ?name[x.head] := x.tail end, list(?entities));
   end function initialize-latin1-entities }

 entities:
   { } => { }
   { ?data; ... } => { ?data, ... }

 data:
   { <!entity ?ent:name cdata ?charref:expression ?comment-body ?:token }
     => { list(?#"ent",
               make(<char-reference>,
                    char: ?charref.as-char,
                    name: copy-sequence(?charref, start: 1,
                                        end: ?charref.size - 1))) }

 comment-body:
  { } => { }
  { ?:name ... } => { ?name, ... }
end macro entities-definer;

// --- end-code ---

So,

1) is there a way to match weird patterns, such as "--" and "<!" (I'm
not clear whether the DRM forbids this), and
2) is there a way to avoid having to insert a separator manually?

For 2), I know that a space can act as a separator for smaller
fragments:

define macro qw // a la perl
{ qw(?words) } => { ?words }
  words:
    { } => { }
    { ?:name ... } => { ?"name" " " ... }
end macro qw;

format-out("%s!\n", qw(Hello world from Doug Auclair));

works fine for these small fragments. Can it work for entity
declarations, too?
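
For what it's worth, here is how I understand the qw expansion. This is
traced by hand, not taken from compiler output, and it assumes that
adjacent string literals are read as a single string, which is my
reading of the DRM grammar:

```dylan
// Hand-traced sketch of the qw macro above (an assumption, not d2c output).
//
//   qw(Hello world)
//     words: { ?:name ... } matches Hello, then recurses on world
//     words: { }            matches the empty remainder
//     => "Hello" " " "world" " "
//
// If adjacent string literals are concatenated, the call should be
// equivalent to passing one string (note the trailing space that the
// last name leaves behind):
format-out("%s!\n", "Hello world ");
```

The point is that each ?:name consumes exactly one token, so whitespace
alone is enough to delimit the fragments. An entity declaration spans
several tokens, which is why I suspect the same trick does not carry
over directly.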

Sincerely,
Doug Auclair