file is an input port or a string naming an existing file containing HTML text. word-proc is a procedure of one argument or #f. markup-proc is a procedure of one argument or #f. white-proc is a procedure of one argument or #f. newline-proc is a procedure of no arguments or #f.
html-for-each opens and reads characters from port file or the file named by string file. Sequential groups of characters are assembled into strings which are either
- enclosed by ‘<’ and ‘>’ (hypertext markups or comments);
- whitespace; or
- none of the above (words).
Procedures are called according to these distinctions, in the order in which the strings occur in file.
newline-proc is called with no arguments for each end-of-line not within a markup or comment.
white-proc is called with strings of non-newline whitespace.
markup-proc is called with hypertext markup strings (including ‘<’ and ‘>’).
word-proc is called with the remaining strings.
html-for-each returns an unspecified value.
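The classification rules above can be modelled in a few lines. The following is only an illustrative sketch written in Python, not SLIB code: the name `html_for_each` and its keyword parameters are hypothetical stand-ins, the input is a string rather than a port or file name, `None` plays the role of #f, and a markup is assumed to end at the first ‘>’ (so comments containing ‘>’ are not handled). Only "\n" is treated as end-of-line.

```python
def html_for_each(text, word_proc=None, markup_proc=None,
                  white_proc=None, newline_proc=None):
    """Hypothetical model of the dispatch described above (not SLIB's code)."""
    i, n = 0, len(text)
    while i < n:
        c = text[i]
        if c == "<":
            # Markup or comment: everything up to and including the next '>'.
            j = text.find(">", i)
            j = n if j == -1 else j + 1
            if markup_proc:
                markup_proc(text[i:j])
        elif c == "\n":
            # End-of-line outside a markup: called with no arguments.
            j = i + 1
            if newline_proc:
                newline_proc()
        elif c.isspace():
            # Run of non-newline whitespace.
            j = i
            while j < n and text[j].isspace() and text[j] != "\n":
                j += 1
            if white_proc:
                white_proc(text[i:j])
        else:
            # None of the above: a word.
            j = i
            while j < n and not text[j].isspace() and text[j] != "<":
                j += 1
            if word_proc:
                word_proc(text[i:j])
        i = j


# Record every callback in order of occurrence.
events = []
html_for_each(
    "<p>Hello  world</p>\n",
    word_proc=lambda s: events.append(("word", s)),
    markup_proc=lambda s: events.append(("markup", s)),
    white_proc=lambda s: events.append(("white", s)),
    newline_proc=lambda: events.append(("newline",)),
)
```

After this runs, `events` holds the markup `<p>`, the word `Hello`, a two-space whitespace string, the word `world`, the markup `</p>`, and finally one newline event. Passing `None` for a procedure simply skips that class of string, mirroring the #f option.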
file is an input port or a string naming an existing file containing HTML text. If supplied, limit must be an integer. limit defaults to 1000.
html:read-title opens and reads HTML from port file or the file named by string file, until reaching the (mandatory) ‘TITLE’ field.
html:read-title returns the title string with adjacent whitespace collapsed to one space.
html:read-title returns #f if the title field is empty or absent, if the first character read from file is not ‘#\<’, or if the end of the title is not found within the first (approximately) limit words.
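The return rules above can be sketched as follows. This is a hypothetical Python model, not SLIB's implementation: `read_title` is an invented name, it takes the HTML as a string instead of a port or file name, `None` stands in for #f, and the way words are counted against `limit` is an assumption (the text only says "approximately").

```python
import re


def read_title(text, limit=1000):
    """Hypothetical model of the title-extraction rules (not SLIB's code)."""
    # The first character read must be '<'.
    if not text.startswith("<"):
        return None
    # Split into markups and words; whitespace is dropped here, which is
    # what collapses adjacent whitespace to a single space on output.
    tokens = re.findall(r"<[^>]*>|[^<\s]+", text)
    words_seen = 0
    title_words = None  # None until the opening TITLE markup is seen
    for tok in tokens:
        if tok.startswith("<"):
            inner = tok[1:-1].split()
            name = inner[0].lower() if inner else ""
            if name == "title":
                title_words = []
            elif name == "/title" and title_words is not None:
                # An empty title yields None, like #f.
                return " ".join(title_words) or None
        else:
            words_seen += 1
            if words_seen > limit:
                return None  # end of title not found within limit words
            if title_words is not None:
                title_words.append(tok)
    return None  # title absent or never closed


title = read_title("<html><head><TITLE>  My \n  Page </TITLE></head>")
```

Here `title` is the string `My Page`. Input that starts with text rather than ‘<’, an empty title, or a title longer than `limit` words all produce `None` in this model.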
htm is a hypertext markup string.
If htm is a (hypertext) comment or DTD, then htm-fields returns #f. Otherwise htm-fields returns the hypertext element string consed onto an association list of the attribute name-symbols and values. If the tag ends with "/>", then "/" is appended to the hypertext element string. The name-symbols are created by string-ci->symbol. Each value is a string, or #t if the name had no value assigned within the markup.
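The shape of the result can be illustrated with a small model. The sketch below is hypothetical Python, not SLIB code: `htm_fields` returns a tuple `(element, attrs)` in place of the consed list, `None` and `True` stand in for #f and #t, lowercased strings stand in for the symbols produced by string-ci->symbol, and comments and DTDs are assumed to be recognisable by a leading "<!". Whether the element string keeps its original case is also an assumption; this model leaves it unchanged.

```python
import re

# name, optionally followed by = and a double-quoted, single-quoted,
# or bare value
_ATTR = re.compile(
    r"""([^\s=/>]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+)))?"""
)


def htm_fields(htm):
    """Hypothetical model of the parsing rules above (not SLIB's code)."""
    if htm.startswith("<!"):
        return None  # comment or DTD
    self_closing = htm.endswith("/>")
    body = htm[1:-2] if self_closing else htm[1:-1]
    parts = body.strip().split(None, 1)
    element = parts[0] if parts else ""
    rest = parts[1] if len(parts) > 1 else ""
    attrs = []
    for m in _ATTR.finditer(rest):
        name, dq, sq, bare = m.groups()
        # A name with no value assigned maps to True (the #t case).
        value = next((v for v in (dq, sq, bare) if v is not None), True)
        attrs.append((name.lower(), value))
    if self_closing:
        element += "/"  # "/" appended when the tag ends with "/>"
    return (element, attrs)


link = htm_fields('<A HREF="x.html" TARGET=_top>')
field = htm_fields("<input disabled/>")
```

In this model `link` is `('A', [('href', 'x.html'), ('target', '_top')])` and `field` is `('input/', [('disabled', True)])`, while a comment such as `<!-- note -->` gives `None`.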