International source files
As discussed in section 2.18, SWI-Prolog supports international character handling. Its internal encoding is UNICODE. I/O streams convert to/from this internal format. This section discusses the options for source files not in US-ASCII.
SWI-Prolog can read files in any of the encodings described in section 2.18. Two encodings are of particular interest. The text encoding deals with the current locale, the default used by this computer for representing text files. The encodings utf8, unicode_le and unicode_be are UNICODE encodings: they can represent, in the same file, characters of virtually any known language. In addition, they do so unambiguously.
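As a minimal sketch of selecting one of these encodings explicitly, the following assumes a hypothetical source file names.pl that is stored as UTF-8. It relies on the encoding option of load_files/2 and on the encoding Prolog flag; the predicate names load_utf8_source/0 and show_default_encoding/0 are invented for this illustration.

```prolog
% Load a source file as UTF-8, regardless of the current locale.
% 'names.pl' is a placeholder file name (an assumption, not from the manual).
load_utf8_source :-
    load_files('names.pl', [encoding(utf8)]).

% Print the default encoding used for streams.
% The value text means: follow the current locale.
show_default_encoding :-
    current_prolog_flag(encoding, Enc),
    format("default encoding: ~w~n", [Enc]).
```

Passing the encoding at load time is useful when the file itself cannot be changed; when the file can be edited, declaring the encoding inside the file keeps the information with the source.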
If one wants to represent non US-ASCII text as Prolog terms in a source file, there are several options:
Use the escape sequence \octal\. The numerical argument is interpreted as a UNICODE character. (Footnote 44: To my knowledge, the ISO escape sequence is limited to 3 octal digits, which means most characters cannot be represented.) The resulting Prolog file is strict 7-bit US-ASCII, but it becomes very hard to read if it contains many non-ASCII characters.
A source file can declare that it is encoded in UTF-8 with the directive
:- encoding(utf8).
Many of today's text editors, including PceEmacs, are capable of editing UTF-8 files. Projects that were started using local conventions can be re-coded using the Unix iconv tool or often using commands offered by the editor.
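The options above can be compared side by side in a short sketch. It assumes the file is saved as UTF-8; the predicates drink/1 and drink_ascii/1 are hypothetical names chosen for this example. The first fact writes the non-ASCII character literally, the second writes the same atom using an octal escape so that the line stays 7-bit US-ASCII.

```prolog
:- encoding(utf8).

% With the directive above, non-ASCII text can be written literally.
drink('café').

% The same atom using an octal escape: octal 351 is code point 233 (é).
% This line contains only 7-bit US-ASCII characters.
drink_ascii('caf\351\').

% Succeeds if both notations denote the same atom.
same_atom :-
    drink(X),
    drink_ascii(X).
```

If the file is indeed read as UTF-8, the goal same_atom should succeed, since both facts then hold the same four-character atom. The escaped form survives any transfer that preserves ASCII, while the literal form is far easier to read and edit.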