Availability:built-in
code_type(?Code,
?Type)
char_type(?Char,
?Type)Tests or generates alternative Types or Char/Codes.
The character types are inspired by the standard C <ctype.h>
primitives. The types are sensitive to the active locale, see
setlocale/3.
Most of the Types are mapped to the Unicode classification
functions from <wctype.h>, e.g., alnum
uses iswalnum(). The types prolog_var_start,
prolog_atom_start, prolog_identifier_continue
and
prolog_symbol are based on the locale-independent built-in
classification routines that are also used by read/1
and friends.
Note that the mode (-,+) is only efficient if the Type has
a parameter, e.g., char_type(C, digit(8)). If Type
is a atomic, the whole unicode range (0..0x1ffff) is generated and
tested against the character classification function.
- alnum
- Char is a letter (upper- or lowercase) or digit.
- alpha
- Char is a letter (upper- or lowercase).
- csym
- Char is a letter (upper- or lowercase), digit or the
underscore (
_). These are valid C and Prolog symbol
characters.
- csymf
- Char is a letter (upper- or lowercase) or the underscore (
_).
These are valid first characters for C and Prolog symbols.
- ascii
- Char is a 7-bit ASCII character (0..127).
- white
- Char is a space or tab, i.e. white space inside a line.
- cntrl
- Char is an ASCII control character (0..31), ASCII DEL
character (127), or non-ASCII character in the range 128..159 or
8232..8233.
- digit
- Char is a digit, i.e., Char is in 0 ....
See also
decimal.
- digit(Weight)
- Char is a digit with value Weight. I.e.
char_type(X,
digit(6)) yields X = ’6’.
Useful for parsing numbers.
- xdigit(Weight)
- Char is a hexadecimal digit with value Weight.
I.e.
char_type(a, xdigit(X)) yields X = ’10’.
Useful for parsing numbers.
- decimal
- Char is a decimal digit in any script. This implies it has
the Unicode general category Nd).
- decimal(Weight)
- Char is a decimal digit in any script with Weight 0
....
- print
- Char is printable character.
- graph
- Char produces a visible mark on a page when printed. Note
that the space is not included!
- lower
- Char is a lowercase letter.
- lower(Upper)
- Char is a lowercase version of Upper. Only true if
Char is lowercase and Upper uppercase.
- to_lower(Upper)
- Char is a lowercase version of Upper. For
non-letters, or letter without case, Char and Lower
are the same. See also upcase_atom/2
and downcase_atom/2.
- upper
- Char is an uppercase letter.
- upper(Lower)
- Char is an uppercase version of Lower. Only true
if
Char is uppercase and Lower lowercase.
- to_upper(Lower)
- Char is an uppercase version of Lower. For
non-letters, or letter without case, Char and Lower
are the same. See also upcase_atom/2
and downcase_atom/2.
- punct
- Char is a punctuation character. This is a
graph
character that is not a letter or digit.
- space
- Char is some form of layout character (tab, vertical tab,
newline, etc.).
- end_of_file
- Char is -1.
- end_of_line
- Char is one of the four ISO/POSIX line-ending control codes
U+000A LF, U+000B VT, U+000C FF, U+000D CR. This is the original ISO
Prolog and C-string-literal definition;
prolog_end_of_line
is the wider set used by the SWI-Prolog reader.
- newline
- Char is a newline character (10).
- period
- Char counts as the end of a sentence (.,!,?).
- quote
- Char is a quote character (
", ', `).
- paren(Close)
- Char is an opening bracket and Close is its
matching close. Covers the three ASCII bracket pairs
(), []
and {}, plus every Unicode Ps/Pe
pair (about 60 pairs in Unicode 17, including angle, corner, ceiling,
floor, mathematical, ornamental, fullwidth and CJK brackets). The
mapping is reversible: with Close bound, Char
unifies with the matching open.
- quote(Close)
- Char is an opening quotation mark and Close is its
matching close. The ASCII quotes
’, ",
and ‘ have
Close = Char; Unicode Pi/Pf
quote pairs (the guillemets, the standard left/right curly single and
double quotes, and the single/double angle and reversed quotation marks)
have Close different from Char. The mapping is
reversible.
- width(Width)
- Width is the number of columns for fixed-width usage used by Char.
True for all printable characters. Most characters require 1 column. Unicode
combining characters require no space (they follow the base
character). Many Asian characters and the Emojis require 2 columns.
These values are used by stream_property/2
for the
position(-Pos) property.
- prolog_layout
- Char is a Prolog layout character: a member of the
Unicode
Pattern_White_Space set used by read_term/2
to separate tokens. The eleven code points are U+0009..U+000D, U+0020,
U+0085, U+200E, U+200F, U+2028 and U+2029. Locale- independent; pinned
to unicode_syntax_version.
prolog_end_of_line is the seven-element line-terminator
subset.
- prolog_end_of_line
- Char ends a line of Prolog source text. Covers the seven
line-terminator-like
Pattern_White_Space code points:
U+000A (LF), U+000B (VT), U+000C (FF), U+000D (CR), U+0085 (NEL), U+2028
(LINE SEPARATOR), and U+2029 (PARAGRAPH SEPARATOR). The same set
terminates % comments and increments the
source line counter. See section
2.15.1.9.
- prolog_var_start
- Char can start a Prolog variable name.
- prolog_atom_start
- Char can start a unquoted Prolog atom that is not a symbol.
- prolog_identifier_continue
- Char can continue a Prolog variable name or atom.
- prolog_symbol
- Char is a Prolog symbol character. Sequences of Prolog symbol
characters glue together to form an unquoted atom. Examples are
=.., \=,
etc.