I was quite baffled that xpath(DOM, //div, DIV) is a syntax error until I realized that div is an operator. xpath(DOM, //'div', DIV),
:- use_module(library(xpath)). (can be autoloaded)
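The syntax is inspired by XPath, using () rather than [] to select inside an element. First we can construct paths using / and //:
//Term selects any node in the DOM matching Term.
/Term matches the root against Term.
Term selects the immediate children of the root matching Term.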
The Terms above are of type callable. The functor specifies
the element name. The element name '*' refers to any element. The
name self refers to the top-element itself and is often
used for processing matches of an earlier xpath/3
query. A term NS:Term refers to an XML name in the namespace NS.
Optional arguments specify additional constraints and functions. The
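arguments are processed from left to right. Defined conditional argument values include:
Integer: the N-th element with the given name, with 1 denoting the first element.
last: the last element with the given name.
last - IntExpr: the IntExpr-th element before the last. For example, last-1 is the element directly preceding the last one.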
Defined function argument values are:
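normalize_space: As text, but uses normalize_space/2 to normalise white-space in the output.
integer (applied to an attribute, as in @Attribute(integer)): As number, but subsequently transforms the value into an integer using the round/1 function.
float (applied to an attribute, as in @Attribute(float)): As number, but subsequently transforms the value into a float using the float/1 function.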
@href and @href(atom) are equivalent. The SGML parser can return attributes as strings using the
In addition, the argument-list can be conditions:
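Left = Right succeeds if the left-hand side unifies with the right-hand side. If the left-hand side is a function, it is evaluated. The right-hand side is never evaluated, and thus the condition content = content defines that the content of the element is the atom content. The functions lower_case and upper_case can be applied to Right (see example below).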
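A path expression can also be used as a condition: it succeeds if it matches in the currently selected sub-DOM. For example, //div(h2/strong)/h3 finds an h3 element inside a div element, where the div element itself contains an h2 child with a strong child.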
This is equivalent to the conjunction of XPath goals below.
...,
xpath(DOM, //(div), Div),
xpath(Div, h2/strong, _),
xpath(Div, h3, Result)
Match each table-row in DOM:
xpath(DOM, //tr, TR)
Match the last cell of each table-row in DOM. This example illustrates that a result can be the input of subsequent xpath/3 queries. Using multiple queries on the intermediate TR term guarantees that all results come from the same table-row:
xpath(DOM, //tr, TR), xpath(TR, /td(last), TD)
Match the href attribute in an <a> element:
xpath(DOM, //a(@href), HREF)
Suppose we have a table containing rows where each first column is the name of a product with a link to details and the second is the price (a number). The following predicate matches the name, URL and price:
product(DOM, Name, URL, Price) :-
    xpath(DOM, //tr, TR),
    xpath(TR, td(1), C1),
    xpath(C1, /self(normalize_space), Name),
    xpath(C1, a(@href), URL),
    xpath(TR, td(2, number), Price).
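To see how the product/4 predicate above behaves, here is a minimal sketch that runs it against a hand-written DOM. The DOM, the predicate names demo_dom/1 and demo_products/1, and the row/3 wrapper are assumptions made for illustration; a real DOM would normally come from the SGML/HTML parser and may contain extra white-space text nodes.

```prolog
:- use_module(library(xpath)).

% Hypothetical DOM in the element(Name, Attributes, Content) form.
% The rows and values are made up for this example.
demo_dom(element(table, [],
    [ element(tr, [],
        [ element(td, [], [element(a, [href='/p/1'], ['  Widget '])]),
          element(td, [], ['9.5'])
        ]),
      element(tr, [],
        [ element(td, [], [element(a, [href='/p/2'], ['Gadget'])]),
          element(td, [], ['12.0'])
        ])
    ])).

% product/4 as given in the text: name and URL come from the first
% cell, the price is the numeric value of the second cell.
product(DOM, Name, URL, Price) :-
    xpath(DOM, //tr, TR),
    xpath(TR, td(1), C1),
    xpath(C1, /self(normalize_space), Name),
    xpath(C1, a(@href), URL),
    xpath(TR, td(2, number), Price).

% Collect one row(Name, URL, Price) term per table-row of the demo DOM.
demo_products(Rows) :-
    demo_dom(DOM),
    findall(row(Name, URL, Price),
            product(DOM, Name, URL, Price),
            Rows).
```

Calling demo_products(Rows) should yield one row/3 term per table-row, with the name stripped of surrounding white-space by normalize_space.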
Suppose we want to select books with genre="thriller" from a tree:
thriller(DOM, Book) :-
    xpath(DOM, //book(@genre=thriller), Book).
Match the elements <table align="center"> and <table align="CENTER">:
//table(@align(lower) = center)
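As a rough illustration of the case-insensitive match above, the sketch below applies the same expression to a small hand-written DOM. The DOM and the names demo_tables/1 and centered_tables/2 are hypothetical and exist only for this example.

```prolog
:- use_module(library(xpath)).

% Hypothetical DOM with three tables that differ only in the align value.
demo_tables(element(body, [],
    [ element(table, [align=center],   []),
      element(table, [align='CENTER'], []),
      element(table, [align=left],     [])
    ])).

% Collect all tables whose align attribute equals center after the
% attribute value has been converted to lower case.
centered_tables(DOM, Tables) :-
    findall(T,
            xpath(DOM, //table(@align(lower) = center), T),
            Tables).
```

With this DOM, centered_tables/2 is expected to return the first two tables and skip the one aligned left.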
Get the width and height of a div element as a number, and the div node itself:
xpath(DOM, //div(@width(number)=W, @height(number)=H), Div)
div is an infix operator, so parentheses must
be used in cases like the following:
xpath(DOM, //(div), Div)
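The following sketch combines the parenthesised form with attribute functions on the matched node. It assumes a hand-written DOM; demo_page/1 and div_size/3 are hypothetical names used only here.

```prolog
:- use_module(library(xpath)).

% Hypothetical DOM with two div elements carrying numeric attributes.
demo_page(element(body, [],
    [ element(div, [width='100', height='40'], ['first']),
      element(p,   [],                         ['not a div']),
      element(div, [width='250', height='80'], ['second'])
    ])).

% Enumerate the width and height of every div element on backtracking.
% The parentheses around div are required because div is an infix
% operator; the attribute values are converted to numbers.
div_size(DOM, Width, Height) :-
    xpath(DOM, //(div), Div),
    xpath(Div, /self(@width(number)), Width),
    xpath(Div, /self(@height(number)), Height).
```

A goal such as demo_page(DOM), div_size(DOM, W, H) should succeed once for each div element and ignore the p element.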