module Yaxpo: sig .. end
Provides XML lexing and basic parsing through a "pull" interface. This is mostly just a lexer; see the Yaxpodom and Yaxposax modules for more conventional parsers.
Basic datatypes
exception Lex_error of string
thrown with a description when an error occurs during lexical analysis.
type txt = string
the type of literal character data
type cdata = string
the type of CDATA section data
type qname = {
}
the type of QNames (prefix:local). pfx = "" means no prefix.
type att = {
  att_name : qname ;
  mutable att_value : txt ;
}
the type of attributes
Serializers - useful for debugging
val string_of_qname : qname -> string
val string_of_att : att -> string
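The prefix rule for QNames described above (an empty pfx means no prefix) can be illustrated with a small stand-alone sketch. The record below is a local stand-in, not the library's qname type: the field name pfx comes from the description above, while the field name local and the function show_qname are assumptions made only for this illustration. The actual output format of string_of_qname is not documented here and may differ.

```ocaml
(* Local stand-in for a QName. [pfx] is named in the documentation above;
   [local] is an assumed field name used only for this sketch. *)
type qname_sketch = { pfx : string; local : string }

(* Hypothetical serializer: prints "prefix:local", or just "local" when the
   prefix is empty. This is not the library's string_of_qname. *)
let show_qname (q : qname_sketch) : string =
  if q.pfx = "" then q.local else q.pfx ^ ":" ^ q.local
```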
Pull Interface
type xml_token =
The type of XML tokens.
exception Parse_error of string * xml_token option
thrown with a description when an error occurs during parsing. May also include the offending token if appropriate.
The distinction between a parse error and a lex error is not sharply defined, so do not rely on it too heavily.
val pull_next_token : #Cps_reader.t -> (xml_token -> unit) -> unit
Pulls the next token from the stream. The second argument is a continuation: a function you supply. When pull_next_token completes, it invokes your continuation with the new token instead of returning the token to its caller. Depending on the reader you use, pull_next_token may return control without having invoked your continuation; this can occur if not enough data is available.
Important: pull_next_token only performs local lexical analysis on the next token in the stream. It maintains no call-to-call state and will not raise errors if tokens arrive in an incorrect order, such as an unmatched end tag. Detecting such errors is not difficult, but it is the responsibility of higher-level parsers.
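To make the continuation style and the division of responsibility concrete, here is a minimal, self-contained sketch of a higher-level tag-balance check driven by a pull function. Nothing in it calls Yaxpo itself: the token type, its constructors (including Eof), the queue-based pull_next stand-in, and the Unbalanced exception are all hypothetical, since the real xml_token constructors and Cps_reader interface are not shown in this section.

```ocaml
(* Hypothetical token type; the real xml_token constructors are not shown. *)
type token =
  | Start_tag of string
  | End_tag of string
  | Text of string
  | Eof

(* Stand-in for pull_next_token. The "reader" is a queue of tokens. The next
   token is passed to the continuation [k]; if the queue is empty, the function
   returns without calling [k], mimicking a reader with no data available. *)
let pull_next (src : token Queue.t) (k : token -> unit) : unit =
  match Queue.take_opt src with
  | Some tok -> k tok
  | None -> ()

(* Raised by the higher-level check when an end tag has no matching start tag. *)
exception Unbalanced of string

(* Tag matching lives in the caller: the stack of open tags is state that the
   pull function itself never keeps. Returns true only if Eof was reached with
   every start tag closed. *)
let check_balanced (src : token Queue.t) : bool =
  let stack = ref [] in
  let finished = ref false in
  let rec step tok =
    match tok with
    | Start_tag name ->
      stack := name :: !stack;
      pull_next src step
    | End_tag name ->
      (match !stack with
       | top :: rest when top = name ->
         stack := rest;
         pull_next src step
       | _ -> raise (Unbalanced name))
    | Text _ -> pull_next src step
    | Eof -> finished := true
  in
  pull_next src step;
  !finished && !stack = []

(* Helper for building a token source from a list. *)
let queue_of_list (l : token list) : token Queue.t =
  let q = Queue.create () in
  List.iter (fun t -> Queue.add t q) l;
  q
```

Note that check_balanced returns false when the source runs dry before Eof is seen. That mirrors the case where the pull function returns control without invoking the continuation, so the caller has to track completion itself.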
Raises
U_char.Bad_char if an invalid character is encountered.
Utf8_char.Bad_encoding if a UTF-8 decoding error occurs.
Lex_error if a lexical error is encountered.
Parse_error if a local parsing error is encountered; for example, an attribute value with no end quote.
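A caller will usually want to catch these exceptions around the pull call. The sketch below shows one possible shape for that, turning the two exceptions declared in this module into a result value. It is self-contained and does not link against Yaxpo: the exceptions are redeclared locally with the same payload shapes as above, the token type is a hypothetical placeholder for xml_token, and describe_failure is an invented helper. U_char.Bad_char and Utf8_char.Bad_encoding are left out because their payloads are not documented in this section.

```ocaml
(* Hypothetical placeholder for xml_token. *)
type token = Text of string

(* Local redeclarations mirroring the payload shapes documented above. *)
exception Lex_error of string
exception Parse_error of string * token option

(* Runs [f] and converts lexing and parsing failures into an error message.
   In real use, [f] would wrap a call to pull_next_token. *)
let describe_failure (f : unit -> unit) : (unit, string) result =
  try
    f ();
    Ok ()
  with
  | Lex_error msg -> Error ("lex error: " ^ msg)
  | Parse_error (msg, None) -> Error ("parse error: " ^ msg)
  | Parse_error (msg, Some _) ->
    Error ("parse error (offending token available): " ^ msg)
```

Because the boundary between lex errors and parse errors is loosely defined, handling both in the same place, as here, is probably the safer choice.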