
Re: text processing as *the* problem



At 11:42 AM -0800 11/28/01, KELLEHER,KEVIN (Non-HP-Roseville,ex1) wrote:
>Congratulations on LL1.  I'm glad that language developers can get together
>and share ideas.
>
>As a language user, I am looking for a language that I can fall in love
>with, and have been following the appearance of new languages for several
>years.  However, there is a problem space that seems neglected, and that is
>text processing.
>
...

>The sort of problem I often need to solve is to extract the links and
>accompanying text from a web page, but only from a certain part of the
>page.  I would like to be able to easily program some processing rules,
>such as "ignore tables that contain forms" or "only collect links that
>begin with '/news/'".
>
>I've also encountered problems in parsing XML that have required some
>"heavy lifting" in terms of string comparisons that I have had to implement
>in C.  All the while something inside cries out that it shouldn't be so hard
>to do.

Based on your examples, it looks to me like what you need is _not_ a 
language that does powerful text processing, but one that can handle 
structured data.  If you're processing XML or HTML at the text level, 
you're missing out on the real power of these languages.  Yes, you 
need a decent parser, but if you use someone else's, you never need 
to deal with the text at all.
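
As a rough illustration of the point above, here is a minimal sketch in
Python (chosen only for the example; nothing in this thread names a
language).  It lets the standard library's html.parser do the tokenizing,
builds a small tree, and then states the two quoted rules as tree queries
rather than string comparisons.  The names Node, TreeBuilder,
inside_table_with_form, news_links, and the SAMPLE page are all
hypothetical, and the sketch assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they are never made the current node.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class Node:
    """One element of the parsed page (hypothetical helper type)."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []  # child Nodes and plain text strings

    def find_all(self, tag):
        """Yield every descendant element with the given tag, in page order."""
        for child in self.children:
            if isinstance(child, Node):
                if child.tag == tag:
                    yield child
                yield from child.find_all(tag)

    def text(self):
        """Concatenate all text found beneath this element."""
        return "".join(c.text() if isinstance(c, Node) else c
                       for c in self.children)


class TreeBuilder(HTMLParser):
    """Turn the parser's start/end/data events into a tree of Nodes."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray closers.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def inside_table_with_form(node):
    """Rule 1: is the node inside a <table> that contains a <form>?"""
    ancestor = node.parent
    while ancestor is not None:
        if (ancestor.tag == "table"
                and next(ancestor.find_all("form"), None) is not None):
            return True
        ancestor = ancestor.parent
    return False


def news_links(html):
    """Rule 2: collect (href, text) for links whose href begins with /news/."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return [(a.attrs["href"], a.text().strip())
            for a in builder.root.find_all("a")
            if (a.attrs.get("href") or "").startswith("/news/")
            and not inside_table_with_form(a)]


# Made-up page: the first table holds a form, the second does not.
SAMPLE = """
<table><tr><td><form action="/search"><input name="q"></form>
<a href="/news/skip">skipped</a></td></tr></table>
<table><tr><td><a href="/news/ll1">LL1 report</a>
<a href="/sports/x">other</a></td></tr></table>
"""

if __name__ == "__main__":
    print(news_links(SAMPLE))
```

Once the tree exists, each rule is a few lines that never look at angle
brackets or quote characters.  A real page with badly nested tags would
probably want a more forgiving parser than this sketch.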

john clements

