Module Utf8_char


module Utf8_char: sig .. end
Defines an abstraction and some utilities for UTF-8 encoded characters.

type t 
The type of UTF-8 encoded characters.
exception Bad_encoding
Raised if an invalid UTF-8 encoding is encountered.
val of_char : char -> t
Creates a UTF-8 character from a regular ASCII character.
Raises Bad_encoding if the character code is outside the range 0-127.
val of_bytes : string -> t
Creates a UTF-8 character from a string of bytes.

Warning: this function does not check whether the bytes form a valid UTF-8 encoding. Use to_U_char to check; it raises Bad_encoding on a decoding error.

val to_bytes : t -> string
Returns the byte string that encodes the UTF-8 character.
val size : t -> int
Determines the size, in bytes, of the UTF-8 character.
val to_U_char : t -> U_char.t
Decodes the character to a Unicode character.
Raises Bad_encoding if there was a decoding error.
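The kind of checking a decoder has to do can be sketched as follows. This is a minimal standalone illustration, not this module's implementation: the names sketch_decode and Sketch_bad_encoding are hypothetical, and the result is a plain int code point rather than a U_char.t. It assumes the input string holds exactly one encoded character.

```ocaml
(* Hypothetical stand-in for the module's Bad_encoding exception. *)
exception Sketch_bad_encoding

(* Decode a string holding exactly one UTF-8 encoded character
   to its Unicode code point, validating as it goes. *)
let sketch_decode (s : string) : int =
  let n = String.length s in
  if n = 0 then raise Sketch_bad_encoding;
  let b0 = Char.code s.[0] in
  (* Expected length, payload bits of the first byte, and the smallest
     code point that legitimately needs that many bytes. *)
  let len, init, min_cp =
    if b0 land 0x80 = 0x00 then (1, b0, 0x0)
    else if b0 land 0xE0 = 0xC0 then (2, b0 land 0x1F, 0x80)
    else if b0 land 0xF0 = 0xE0 then (3, b0 land 0x0F, 0x800)
    else if b0 land 0xF8 = 0xF0 then (4, b0 land 0x07, 0x10000)
    else raise Sketch_bad_encoding
  in
  if n <> len then raise Sketch_bad_encoding;
  let cp = ref init in
  for i = 1 to len - 1 do
    let b = Char.code s.[i] in
    (* Every continuation byte must look like 10xxxxxx. *)
    if b land 0xC0 <> 0x80 then raise Sketch_bad_encoding;
    cp := (!cp lsl 6) lor (b land 0x3F)
  done;
  (* Reject overlong forms, values above U+10FFFF, and surrogates. *)
  if !cp < min_cp || !cp > 0x10FFFF || (!cp >= 0xD800 && !cp <= 0xDFFF)
  then raise Sketch_bad_encoding;
  !cp
```

For example, sketch_decode "\xC3\xA9" evaluates to 0xE9, while the overlong form "\xC0\x80" raises Sketch_bad_encoding.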
val utf8_character_size : int -> int
Decoding utility: given just the first byte of a UTF-8 character, determines how many bytes the whole character should occupy. (In UTF-8, this information is encoded in the high bits of the first byte.)
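How the first byte carries the length can be sketched as below. This is a standalone illustration under stated assumptions, not the library's code: sketch_character_size and Bad_lead_byte are hypothetical names, and how utf8_character_size itself reacts to an invalid first byte is not documented here.

```ocaml
(* Hypothetical exception for bytes that cannot start a UTF-8 sequence. *)
exception Bad_lead_byte of int

(* Given the first byte (0-255) of a UTF-8 sequence, return the total
   number of bytes in that sequence. Only the bit pattern is inspected;
   lead bytes that are invalid for other reasons (such as 0xC0 and 0xC1)
   are not rejected here. *)
let sketch_character_size (b : int) : int =
  if b < 0 || b > 0xFF then raise (Bad_lead_byte b)
  else if b land 0x80 = 0x00 then 1   (* 0xxxxxxx: ASCII *)
  else if b land 0xE0 = 0xC0 then 2   (* 110xxxxx *)
  else if b land 0xF0 = 0xE0 then 3   (* 1110xxxx *)
  else if b land 0xF8 = 0xF0 then 4   (* 11110xxx *)
  else raise (Bad_lead_byte b)        (* 10xxxxxx or 11111xxx *)
```

For example, sketch_character_size 0xE2 evaluates to 3, and a continuation byte such as 0x80 raises Bad_lead_byte.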