set_sgml_parser(+Parser, +Option)
Sets attributes on the parser. Currently defined attributes:
- file(File)
- Sets the file for reporting errors and warnings. Sets the line to 1.
- line(Line)
- Sets the current line. Useful for generating proper line numbers if the
stream is not at the start of the (file) object.
- linepos(LinePos)
- Sets the notion of the current column in the source line.
- charpos(Offset)
- Sets the current character location. See also the file(File) option.
- position(Position)
- Set source location from a stream position term as obtained using
stream_property(Stream, position(Position)).
- dialect(Dialect)
- Set the markup dialect. Known dialects:
- sgml
- The default dialect is to process as SGML. This implies markup is
case-insensitive and standard SGML abbreviation is allowed (abbreviated
attributes and omitted tags).
- html
- html4
- This is the same as sgml, but implies shorttag(false) and accepts XML
empty element declarations (e.g., <img src="..."/>).
- html5
- In addition to html, accept attributes named data- without warning.
This value initialises the charset to UTF-8.
- xhtml
- xhtml5
- These document types are processed as xml. Dialect xhtml5 accepts
attributes named data- without warning.
- xml
- This dialect is selected automatically if the processing instruction
<?xml ...> is encountered. See section 3.3 for details.
- xmlns
- Process file as XML file with namespace support. See section 3.3.1 for
details. See also the qualify_attributes option below.
- xmlns(+URI)
- Set the default namespace of the outer environment. This option is
provided to process partial XML content with proper namespace
resolution.
- xmlns(+NS, +URI)
- Specify a namespace for the outer environment. This option is provided
to process partial XML content with proper namespace resolution.
- qualify_attributes(Boolean)
- How to handle unqualified attributes (i.e., attributes without an
explicit namespace) in XML namespace (xmlns) mode. The default, which is
standard-compliant, is not to qualify such attributes. If true, such
attributes are qualified with the namespace of the element they appear
in. This option is for backward compatibility, as this is the behaviour
of older versions. In addition, the namespace document suggests
unqualified attributes are often interpreted in the namespace of their
element.
- space(SpaceMode)
- Define the initial handling of white-space in PCDATA. This attribute is
described in section 3.2.
- number(NumberMode)
- If token (default), attributes of type number are passed as a Prolog
atom. If integer, such attributes are translated into Prolog integers.
If the conversion fails (e.g. due to overflow) a warning is issued and
the value is passed as an atom.
- encoding(Encoding)
- Set the initial encoding. The default initial encoding for XML documents
is UTF-8 and for SGML documents ISO-8859-1. XML documents may change the
encoding using the encoding= attribute in the header. Explicit use of
this option is only required to parse non-conforming documents.
Currently accepted values are iso-8859-1 and utf-8.
- doctype(Element)
- Defines the toplevel element expected. If a <!DOCTYPE declaration has
been parsed, the default is the defined doctype. The parser can be
instructed to accept the first element encountered as the toplevel
using doctype(_). This feature is especially useful when parsing part
of a document (see the parse option to sgml_parse/2).
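
As a minimal sketch of how several of the attributes above combine, the following assumes SWI-Prolog with library(sgml) available and parses an XML fragment that is imagined to start at line 10 of a larger file. The predicate name parse_fragment/2 and the file name example.xml are hypothetical and chosen only for illustration; new_sgml_parser/2, sgml_parse/2 and free_sgml_parser/1 come from the same library. Note that file(File) is set before line(Line), since setting the file resets the line to 1.

```prolog
:- use_module(library(sgml)).

%!  parse_fragment(+Text, -DOM) is det.
%
%   Hypothetical helper: parse Text (a string holding an XML
%   fragment) into a DOM term, accepting whatever element comes
%   first as the toplevel element.
parse_fragment(Text, DOM) :-
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        ( % file/1 resets the line to 1, so set it first.
          set_sgml_parser(Parser, file('example.xml')),  % assumed name
          set_sgml_parser(Parser, line(10)),             % assumed offset
          set_sgml_parser(Parser, dialect(xml)),
          % Accept the first element encountered as the toplevel.
          set_sgml_parser(Parser, doctype(_)),
          setup_call_cleanup(
              open_string(Text, In),
              sgml_parse(Parser, [source(In), document(DOM)]),
              close(In))
        ),
        free_sgml_parser(Parser)).
```

A query such as parse_fragment("<note id=\"n1\">hello</note>", DOM) should bind DOM to a list holding a single element(note, ...) term. Using setup_call_cleanup/3 here is a design choice of the sketch, so that both the parser and the input stream are released even if parsing raises an exception.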