Module Ez_html.XmlParser
type source =
  | SFile of string
  | SChannel of Stdlib.in_channel
  | SString of string
  | SLexbuf of Stdlib.Lexing.lexbuf
Several kinds of resources can contain Xml documents: a file (SFile), an input channel (SChannel), a string (SString), or a lexing buffer (SLexbuf).
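As an illustration of choosing a source, here is a minimal sketch that maps a command-line argument to a source value. The type is redeclared locally only so that the snippet compiles on its own; with the library loaded you would use the constructors of XmlParser.source instead. The function name source_of_arg and the convention that "-" means standard input are assumptions of this example, not part of the module.

```ocaml
(* Local copy of the source type from this page, so the sketch is standalone. *)
type source =
  | SFile of string
  | SChannel of in_channel
  | SString of string
  | SLexbuf of Lexing.lexbuf

(* Hypothetical helper: "-" selects standard input, anything else is
   treated as a file name. *)
let source_of_arg = function
  | "-" -> SChannel stdin
  | path -> SFile path
```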
val make : unit -> t
This function returns a new parser with default options.
val prove : t -> bool -> unit
This function enables or disables automatic DTD proving by the parser. Note that Xml documents having no reference to a DTD are never proved when parsed, but you can prove them later using the Dtd module. By default, prove is true.
val resolve : t -> (string -> Dtd.checked) -> unit
When parsing an Xml document from a file using the Xml.parse_file function, the DTD file, if declared by the Xml document, has to be in the same directory as the xml file. When using other parsing functions, such as on a string or on a channel, the parser will always raise Xml.File_not_found if a DTD file is needed and prove is enabled. To enable loading of the DTD file, the user has to configure the Xml parser with a resolve function, which takes the DTD filename as argument and returns a checked DTD. The user can then implement any kind of DTD loading strategy, and can use the Dtd module functions to parse and check the DTD file. By default, the resolve function raises Xml.File_not_found.
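One possible loading strategy is to look DTD files up in a fixed directory and cache the result, so that each DTD is loaded at most once. The sketch below is only an illustration: cached_resolver, dir and load are names invented for this example, and load stands for whatever function you write on top of the Dtd module to turn a path into a checked DTD.

```ocaml
(* Hypothetical helper: builds a resolver that loads DTD files from [dir]
   and remembers each result by filename. [load] is supplied by the caller
   and is expected to parse and check the DTD found at the given path. *)
let cached_resolver ~dir ~load =
  let cache = Hashtbl.create 8 in
  fun filename ->
    match Hashtbl.find_opt cache filename with
    | Some dtd -> dtd
    | None ->
        let dtd = load (Filename.concat dir filename) in
        Hashtbl.add cache filename dtd;
        dtd
```

With a load function returning Dtd.checked, the result of cached_resolver ~dir ~load has the type string -> Dtd.checked expected by resolve. If load raises, nothing is cached and the exception propagates to the parser's caller.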
val check_eof : t -> bool -> unit
When an Xml document is parsed, the parser checks that the end of the document is reached, so for example parsing "<A/><B/>" will fail instead of returning only the A element. You can turn off this check by setting check_eof to false. By default, check_eof is true.
val parse : t -> source -> Xml_types.xml
Once the parser is configured, you can run it on any kind of xml document source to parse its contents into an Xml data structure.
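Putting the configuration functions together, a typical session might look like the following sketch. It uses only the functions documented on this page and assumes the library is installed and that the module is reachable as Ez_html.XmlParser; the name parse_lenient is invented for the example.

```ocaml
module P = Ez_html.XmlParser

(* Hypothetical helper: parse a string without DTD proving and without
   the end-of-document check. *)
let parse_lenient text =
  let parser = P.make () in
  P.prove parser false;      (* no DTD proving, so no resolve function is needed *)
  P.check_eof parser false;  (* accept trailing content after the first element *)
  P.parse parser (P.SString text)

(* With check_eof disabled, this input no longer fails on the trailing <B/>. *)
let _doc = parse_lenient "<A/><B/>"
```

Each call to make returns a separate parser, so these settings do not affect parsers created elsewhere.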
val concat_pcdata : t -> bool -> unit
When several PCData elements are separated by a \n (or \r\n), the parser can either split them into distinct PCData elements or merge them into one PCData with \n as separator. The default behavior is to concatenate the PCData, but this can be changed for a given parser with this flag.
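The difference between the two modes can be pictured with a small standalone model. This is not the parser's implementation: the node type and the pcdata_nodes function are stand-ins invented for this example, which only mirrors the behavior described above for text that spans several lines.

```ocaml
(* Hypothetical stand-in for the text nodes of the parsed document. *)
type node = PCData of string

(* Model of the flag: [lines] are the pieces of text that were separated
   by a newline in the input. *)
let pcdata_nodes ~concat lines =
  if concat then [ PCData (String.concat "\n" lines) ]  (* one merged node *)
  else List.map (fun line -> PCData line) lines         (* one node per line *)
```

In this model, the lines "first" and "second" give the single node PCData "first\nsecond" when concat is true, and two separate nodes when it is false.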