
HTML Tidy Wrapper for Functional Developer



I've uploaded to my site[1] a wrapper around the HTML Tidy COM
component[2].

I used the FD COM interface wizard to generate the low-level bindings
and then added a couple of higher-level functions so that clients
don't need to call the COM routines directly.

Basically, it takes HTML as a string (or as a file, if you use the
low-level COM wrapper) and returns the tidied string.

It's useful when parsing HTML that has overlapping tags or is otherwise
not well formed. I pass the page through HTML Tidy and then run an HTML
or XML parser on the result.

Usage looks something like this:

  // Fetch a page, then tidy it. html-tidy returns the tidied HTML
  // along with any warnings and errors.
  let page = get-web-page(#"http", "www.double.co.nz", 80, "/");
  let (tidied-page, warnings, errors) = html-tidy(page);
  if (warnings)
    format-out("\nWarnings:\n");
    for (warning in warnings)
      format-out("  %s\n", warning)
    end for;
  else
    format-out("No Warnings.\n");
  end if;

  // 'err' rather than 'error', to avoid shadowing the error function.
  if (errors)
    format-out("Errors:\n");
    for (err in errors)
      format-out("  %s\n", err)
    end for;
  else
    format-out("No Errors.\n");
  end if;
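
One thing to watch in the snippet above: in Dylan only #f counts as
false, so an empty sequence of warnings would still take the
"Warnings:" branch. Whether html-tidy returns #f or an empty sequence
when there is nothing to report isn't stated here, so the following is
only a sketch that handles both cases. report-messages is a
hypothetical helper for illustration, not part of the wrapper.

```dylan
// Hypothetical helper (not part of the wrapper): prints a labelled
// list of messages, or "No <label>." when there is nothing to report.
define method report-messages (label :: <string>, messages) => ()
  // Assumption: "nothing to report" may arrive as #f or as an empty
  // collection, so check for both.
  if (messages & ~empty?(messages))
    format-out("%s:\n", label);
    for (message in messages)
      format-out("  %s\n", message)
    end for;
  else
    format-out("No %s.\n", label);
  end if;
end method report-messages;
```

With that in place, the reporting part of the example would reduce to
report-messages("Warnings", warnings) followed by
report-messages("Errors", errors).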

[1] http://www.double.co.nz/dylan
[2] http://perso.wanadoo.fr/ablavier/TidyCOM/index.html

Chris.
-- 
http://www.double.co.nz/dylan


