
Re: What is a lightweight language



Don Blaheta wrote:
> 
> Quoth Paul Prescod:
> > XML's tag-redundancy provides much better error recovery
> 
> For every error you show me that is fixed by this, I'll show you a place
> where I went to change an <h1> to an <h3>, or an <ol> to a <ul>, or some
> more complicated change, and forgot to change the matching close tag. :P

As long as your XML parser quickly gets you to the place where the
error occurs, you can fix it much more quickly than the person
trying to figure out where they forgot a paren. And I could easily
imagine an Emacs that detected the end-tag and updated it for you.
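
To make that concrete, here is a minimal sketch in Python using the
standard library's ElementTree parser. The document and the helper name
locate_error are made up for illustration; the point is only that a
mismatched end tag is reported with a line and column.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the start tag was changed to <h3>,
# but the matching end tag was left as </h1>.
doc = "<body>\n<h3>Title</h1>\n</body>"

def locate_error(text):
    """Return (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        return err.position
    return None

print(locate_error(doc))
```

The reported line is the one holding the stale end tag, which is where
you need to be to fix it.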

>...
> The verbosity actually hurts the readability quite a bit, imo.  

Opinions vary. When I am at the bottom of a document working towards the
top, looking for a section end, I do a reverse search for </section>.
With s-expressions I have to go to the top and depend on brace-matching
software.
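
The same reverse search is easy to script. A rough sketch in Python,
where the sample document and the function name last_end_tag_line are
hypothetical:

```python
def last_end_tag_line(text, tag, before=None):
    """Return the 1-based line of the last </tag> before offset `before`, or None."""
    end = len(text) if before is None else before
    # str.rfind searches backwards, like a reverse search in an editor.
    pos = text.rfind("</" + tag + ">", 0, end)
    if pos == -1:
        return None
    return text.count("\n", 0, pos) + 1

# Hypothetical document with one section.
doc = "<doc>\n<section>\n<p>one</p>\n</section>\n<p>tail</p>\n</doc>"
print(last_end_tag_line(doc, "section"))
```

No matching of openers against closers is needed, because the end tag
names what it closes.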

>...
> > XML is not ASCII text, but Unicode, whereas the Common Lisp and Scheme
> > languages predate Unicode.
> 
> So what?  S-expressions are not the exclusive domain of these languages,
> and it's a no-brainer to imagine S-expressions that allow unicode text
> instead.  

So we define an s-expression competitor that is incompatible with the
environments that currently use them. Can you see that there is a
non-trivial cost to that?

> ... In fact, if you have access to Unicode, you could further
> imagine using a non-ASCII bracketing character to avoid the
> abovementioned paren problem.

That wouldn't be a good idea because people edit their "Unicode"
documents in ASCII text editors.

>...
> 
> Well, Scheme and LISP s-expressions do.  In my day-to-day research (in

Can you see that your day-to-day research is miles away from web page
development, technical publishing or business document interchange?

>...
> The verbosity is the *least* of the problems here.  The underlying text
> is entirely lost under the weight of the markup; it is not visually
> offset from the markup at all, either.  

Actually it seems more visually offset to my XML-trained eyes than the
S-expressions. To me, <FOO>text</FOO> bar makes the text distinct
because it is *outside* of markup whereas (FOO text) kind of blends
them.

>... But the XML
> format would clearly be sending many sentences into pagewrap-land.

There is no doubt that XML is verbose. That verbosity makes it much
easier to visually parse from the bottom and makes error reporting much
more accurate. Obviously different people weigh these costs and benefits
differently. All I can say is that the XML community chose, after years
of experience with both ways, to vote in favour of redundant and
explicit end-tags.

> And I haven't even gotten started on how much easier sexps are to parse
> than XML.  They're a simple READ statement in Scheme, of course, but
> even in C, a page of lex and yacc will do the trick.

In my experience it takes roughly one line in C, Perl, Python or almost
any language. Typically the line instantiates a parser object and hands
it a filename. Not unlike READ.
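
For instance, in Python it might look like the sketch below. The file
contents and element names are invented so the example can run on its
own; the ET.parse line is the one doing the work.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Write a small hypothetical document so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as f:
    f.write("<section><title>Intro</title><para>Hello</para></section>")
    path = f.name

tree = ET.parse(path)  # the "one line": hand the parser a filename
root = tree.getroot()
title = root.findtext("title")

os.remove(path)  # clean up the temporary file
```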

The tricky part in handling XML is handling impedance mismatches between
the domain of the XML vocabulary and your application objects. Because
S-expressions are so rarely interchanged between heterogeneous systems,
this difficulty is not apparent. You just hand the program an
S-expression and it immediately has a useful data structure. Well, that's
because the s-expressions are usually tailored for the needs of *that
particular program*. Python has pickle. Perl has Data::Dumper. If you
have a homogeneous or one-program system that can get away with using
these instead of XML, you should probably do so.
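
A small Python sketch of the difference, assuming a made-up order
record and a made-up XML vocabulary for it (neither comes from any real
system):

```python
import pickle
import xml.etree.ElementTree as ET

# Hypothetical application data.
order = {"number": 42, "items": [("widget", 2), ("gadget", 1)]}

# Same-program case: the structure round-trips with no mapping code.
restored = pickle.loads(pickle.dumps(order))

# Interchange case: a hypothetical vocabulary designed by someone else.
doc = """<order number="42">
  <item qty="2">widget</item>
  <item qty="1">gadget</item>
</order>"""

def order_from_xml(text):
    """Map the XML vocabulary onto the application's own structure."""
    root = ET.fromstring(text)
    return {
        "number": int(root.get("number")),
        "items": [
            (item.text, int(item.get("qty")))
            for item in root.findall("item")
        ],
    }

mapped = order_from_xml(doc)
```

Parsing is the easy part in both cases. The hand-written mapping in
order_from_xml is where the mismatch shows up, and it only exists
because the vocabulary was not tailored to this one program.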

 Paul Prescod