LIFTCOVER Manual¶
Introduction¶
Predicate Reference¶
Installation¶
LIFTCOVER is distributed as a pack of SWI-Prolog. To install it, use
$ swipl
?- pack_install(liftcover).
Requirements¶
It uses the packs
They are installed automatically when installing pack liftcover or can be installed manually as follows
$ swipl
?- pack_install(lbfgs).
?- pack_install(auc).
Pack lbfgs is optional, if absent the versions of the algorithms that use lbfgs do not work but the other versions work.
Parameter learning can be performed using Python and the Python packages numpy, torch and cupy, either on CPU or GPU. If they are not installed, the algorithms that use them do not work, but you can still run the Prolog version of parameter learning.
You can upgrade the pack with
$ swipl
?- pack_upgrade(liftcover).
Note that the packs on which liftcover depends are not upgraded automatically. In this case, they need to be upgraded manually.
Example of use¶
$ cd <pack>/liftcover/prolog/examples
$ swipl
?- [muta].
?- induce_par_lift([1,2,3,4,5,6,7,8,9],P).
Testing the installation¶
$ swipl
?- [library(test_liftcover)].
?- test.
Syntax¶
LIFTCOVER learns liftable probabilistic logic programs a restricted version of Logic programs with annotated disjunctions LPADs ([VV03, VVB04]). A HPLP is a set of annotated clauses whose head contain a single atom annotated with a probability. For the rest, the usual syntax of Prolog is used.
A clause in such a program has a single head with the target predicate (the predicate you want to learn) and a body composed of input predicates or predicates defined by deterministic clauses (typically facts).
A general HPLP clause has the following form:
h:p:- b1,b2...,bn.
The following example inspired from the UWCSE dataset used in [KD05] is represented as (file uwcse.pl)
advisedby(A,B):0.3 :-student(A),professor(B),project(A,C),project(B,C).
advisedby(A,B):0.6 :-student(A),professor(B),ta(C,A),taughtby(C,B).
student(harry). professor(ben).
project(harry,pr1). project(harry,pr2).
project(ben,pr1). project(ben,pr2).
taughtby(c1,ben). taughtby(c2,ben).
ta(c_1,harry). ta(c_2,harry).
where publication(A,B,C)
means that A is a publication with author B produced in project C.
advisedby/2
is the target predicate and student/1, professor/1, project/2, ta/2, taughtby/2
are input predicates.
- The first clause states that a student A is advised by a professor B with probability 0.3 if they both
work on the same project C.
- The second states that a student A is advised by a professor B with probability 0.6 if the student
is a teacher assistant in a course taught by the professor.
- The facts in the program state that harry is a student and ben a professor.
They have two joint courses c1 and c2, two joint projects pr1 and pr2.
Semantics¶
The semantics of liftable PLP directly inherits the semantics of LPADs.
Learning¶
Input¶
To execute the learning algorithms, prepare a Prolog file divided in five parts
preamble
background knowledge, i.e., knowledge valid for all interpretations
liftable PLP for which you want to learn the parameters (optional)
language bias information
example interpretations
The preamble must come first, the order of the other parts can be changed.
For example, consider the Bongard problems of [DRVL95]. bongard.pl and bongardkeys.pl represent a Bongard problem for LIFTCOVER.
Preamble¶
In the preamble, the LIFTCOVER library is loaded with (bongard.pl):
:- use_module(library(liftcover)).
Now you can initialize with
:- lift.
At this point you can start setting parameters for SLEAHP such as for example
:- set_lift(megaex_bottom,10).
:- set_lift(min_probability,0.00001).
:- set_lift(verbosity,1).
We will later see the list of available parameters.
Background and Initial hierarchical program¶
Now you can specify the background knowledge with a fact of the form
bg(<list of terms representing clauses>).
where the clauses must be deterministic.
Alternatively, you can specify a set of clauses by including them in a section between :- begin_bg.
and :- end_bg.
Moreover, you can specify an initial program with a fact of the form
in(<list of terms representing clauses>).
The initial program is used in parameter learning for providing the structure.
Remember to enclose each clause in parentheses because :-
has the highest precedence.
For example, bongard.pl has the initial program
in([(pos:0.197575 :-
circle(A),
in(B,A)),
(pos:0.000303421 :-
circle(A),
triangle(B)),
(pos:0.000448807 :-
triangle(A),
circle(B))]).
Alternatively, you can specify an input program in a section between :- begin_in.
and :- end_in.
as for example
:- begin_in.
pos:0.197575 :-
circle(A),
in(B,A).
pos:0.000303421 :-
circle(A),
triangle(B).
pos:0.000448807 :-
triangle(A),
circle(B).
:- end_in.
If you specify both a in/1
fact and a section, the clauses of the two will be combined.
The annotations of the head atoms of the initial program can be probabilities, as in the example above. In parameter learning, the learning procedure can start with the initial parameters in the program. In this case, it is up to the user to ensure that there are values between 0 and 1. In the case of structure learning, the initial program is not necessary.
Language Bias¶
The language bias part contains the declarations of the input and output predicates. The output predicate is declared as
output(<predicate>/<arity>).
and indicates the predicate whose atom you want to predict (target predicate). Derivations for the atoms for this predicate in the input data is built by the system. Input predicates are those whose atoms you are not interested in predicting. You can declare input predicates with
input(<predicate>/<arity>).
For these predicates, the only true atoms are those in the interpretations and those derivable from them using the background knowledge.
Then, for structure learning you have to specify the language bias by means of mode declarations in the style of Progol.
modeh(<recall>,<predicate>(<arg1>,...)).
specifies the atoms that can appear in the head of clauses.
modeb(<recall>,<predicate>(<arg1>,...)).
specifies the atoms that can appear in the body of clauses. <recall>
can be an integer or *
. <recall>
indicates how many atoms for the predicate specification are retained in the bottom clause during a saturation step. *
stands for all those that are found.
Otherwise the indicated number of atoms are randomly chosen.
Arguments of the form
+<type>
specifies that the argument should be an input variable of type <type>
, i.e., a variable replacing a +<type>
argument in the head or a -<type>
argument in a preceding literal in the current hypothesized clause.
Another argument form is
-<type>
for specifying that the argument should be a output variable of type <type>
.
Any variable can replace this argument, either input or output.
The only constraint on output variables is that those in the head of a clause must appear as output variables in an atom in the body.
Other forms are
#<type>
for specifying an argument which should be replaced by a constant of type <type>
in the bottom clause but should not be used for replacing input variables of the following literals when building the bottom clause or
-#<type>
for specifying an argument which should be replaced by a constant of type <type>
in the bottom clause and that should be used for replacing input variables of the following literals when building the bottom clause.
<constant>
for specifying a constant.
An example of language bias for the Bongard domain is
output(pos/0).
input(triangle/1).
input(square/1).
input(circle/1).
input(in/2).
input(config/2).
modeh(*,pos).
modeb(*,triangle(-obj)).
modeb(*,square(-obj)).
modeb(*,circle(-obj)).
modeb(*,in(+obj,-obj)).
modeb(*,in(-obj,+obj)).
modeb(*,config(+obj,-#dir)).
LIFTCOVER also requires facts for the determination/2
Aleph-style predicate that indicate which predicates can appear in the body of clauses.
For example
determination(pos/0,triangle/1).
determination(pos/0,square/1).
determination(pos/0,circle/1).
determination(pos/0,in/2).
determination(pos/0,config/2).
state that triangle/1
can appear in the body of clauses for pos/0
.
Example Interpretations¶
The last part of the file contains the data.
You can specify data with two modalities: models and keys.
In the models case, you specify an example model (or interpretation or mega-example) as a list of Prolog facts initiated by begin(model(<name>)).
and terminated by end(model(<name>)).
as in
begin(model(2)).
pos.
triangle(o5).
config(o5,up).
square(o4).
in(o4,o5).
circle(o3).
triangle(o2).
config(o2,up).
in(o2,o3).
triangle(o1).
config(o1,up).
end(model(2)).
The facts in the interpretation are loaded in SWI-Prolog database by adding an extra initial argument equal to the name of the model.
After each interpretation is loaded, a fact of the form int(<id>)
is asserted, where id
is the name of the interpretation.
This can be used in order to retrieve the list of interpretations.
Alternatively, with the keys modality, you can directly write the facts and the first argument will be interpreted as a model identifier. The above interpretation in the keys modality is
pos(2).
triangle(2,o5).
config(2,o5,up).
square(2,o4).
in(2,o4,o5).
circle(2,o3).
triangle(2,o2).
config(2,o2,up).
in(2,o2,o3).
triangle(2,o1).
config(2,o1,up).
which is contained in the bongardkeys.pl.
This is also how model 2
above is stored in SWI-Prolog database.
The two modalities, models and keys, can be mixed in the same file.
Facts for int/1
are not asserted for interpretations in the key modality but can be added by the user explicitly.
Note that you can add background knowledge that is not probabilistic directly to the file writing clauses taking into account the model argument.
For example (carc.pl
), contains
connected(_M,Ring1,Ring2):-
Ring1 \= Ring2,
member(A,Ring1),
member(A,Ring2), !.
symbond(Mod,A,B,T):- bond(Mod,A,B,T).
symbond(Mod,A,B,T):- bond(Mod,B,A,T).
where the first argument of all atoms is the model.
Then you must indicate how examples are divided in folds with facts of the form: fold(<fold_name>,<list of model identifiers>)
, as for example
fold(train,[2,3,...]).
fold(test,[490,491,...]).
As the input file is a Prolog program, you can define intentionally the folds as in
fold(all,F):-
findall(I,int(I),F).
fold/2
is dynamic so you can also write
:- fold(all,F),
sample_lift(4,F,FTr,FTe),
assert(fold(rand_train,FTr)),
assert(fold(rand_test,FTe)).
which however must be inserted after the input interpretations otherwise the facts for int/1
will not be available and the fold all
would be empty.
This command uses sample_lift(N,List,Sampled,Rest)
exported from liftcover
that samples N
elements from List
and returns the sampled elements in Sampled
and the rest in Rest
.
If List
has N
elements or less, Sampled
is equal to List
and Rest
is empty.
Commands¶
Parameter Learning¶
To execute LIFTCOVER, prepare an input file as indicated above and call
?- induce_lift_par(+List_of_folds:list,-P:list).
where <list of folds>
is a list of the folds for training and P
will contain
the input program with updated parameters.
For example bongard.pl, you can perform parameter learning on the train
fold with
?- induce_lift_par([train],P).
The algorithm that is used for parameter learning is specified by
the parameter parameter_learning
that can be set to
em
, for Expectation Maximization, in Prologem_python
, for Expectation Maximization, in Python, using eithercpu
orgpu
, depending on the value of theprocessor
parameter. In the first case, numpy is used, in the latter case cupy is used.gd
, for Gradient Descent, in Prologgd_python
, for Gradient Descent, in Python, using Pytorch, using eithercpu
orgpu
, depending on the value of theprocessor
parameter. In both cases, torch is used.lbfgs
for Limited-memory Broyden-Fletcher-Goldfarb-Shanno, in Prolog and C
Structure Learning¶
To execute LIFTCOVER, prepare an input file in the editor panel as indicated above and call
?- induce_lift(+List_of_folds:list,-P:list).
where List_of_folds
is a list of the folds for training and P
will contain the learned program.
For example bongard.pl, you can perform structure learning on the train
fold with
?- induce_lift([train],P).
A program can also be tested on a test set with test_lift/7
as described below.
Helper Predicates¶
sort_rules(+RulesIn:list_of_rules,-RulesOut:list_of_rules) is det
The predicate sorts RulesIn
according to the probability of the rules.
filter_rules(+RulesIn:list_of_rules,-RulesOut:list_of_rules,+Min_prob:float) is det
The predicate removes the rules with a probability below or equal to Min_prob
.
filter_rules(:RulesIn:list_of_rules,-RulesOut:list_of_rules) is det
The predicate removes the rules with a probability below or equal to the min_prob
parmeter.
remove_zero(+RulesIn:list_of_rules,-RulesOut:list_of_rules) is det
The predicate removes the rules with a probability of 0.0.
Testing¶
A program can also be tested on a test set in LIFTCOVER with
test_lift(+Program:list,+List_of_folds:list,-LL:float,-AUCROC:float,-ROC:list,-AUCPR:float,-PR:list) is det
where Program
is a list of terms representing clauses and List_of_folds
is a list of folds.
test_lift/7
returns the log likelihood of the test examples in LL
, the Area Under the ROC curve in AUCROC
, a dictionary containing the list of points (in the form of Prolog pairs x-y
) of the ROC curve in ROC
, the Area Under the PR curve in AUCPR
, a dictionary containing the list of points of the PR curve in PR
.
Then you can draw the curves using C3.js as follows
compute_areas_diagrams(+ExampleList:list,-AUCROC:float,-ROC:dict,-AUCPR:float,-PR:dict) is det
(from pack auc.pl) that takes as input a list ExampleList
of pairs probability-literal.
For example, to test on fold test
the program learned on fold train
you can run the query
?- induce_lift_par([train],P),
test_lift(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Or you can test the input program on the fold test
with
?- in(P),test_lift(P,[test],LL,AUCROC,ROC,AUCPR,PR).
In SWISH, by including
:- use_rendering(c3).
:- use_rendering(lpad).
in the code before :- lift.
the curves will be shown as graphs using C3.js and the output program will be pretty printed.
Predicates
prob_lift(:At:atom,-P:float) is multi
prob_lift(:At:atom,+Program:probabilistic_program,-P:float) is multi
compute the probability of atom At
given by the
input program the first and by Program
the latter.
The first argument of At
should be the model name.
If At
contains variables, the predicate returns
all the instantiaions of At=
with their probabilities in backtracking.
For example
?- prob_lift(pos(2),P).
or
?- prob_lift(christian_religion(f1,C),P).
C = 'AMSA',
P = 0.409107 ;
C = 'AUS',
P = 0.409107
...
Predicates
ranked_answers(:At:atom,+Var:var,-RankedAnswers:list) is multi
ranked_answers(:At:atom,+Var:var,+Program:probabilistic_program,-RankedAnswers:list) is multi
return a list of answers for the query At=.
:code:`Var
should be a variable in At
. RankedAnswers
is a list of pairs
(P-A)
where P
is the probability of the answer At{Var/A}
.
The list is sorted in decreasing order of probability.
The first argument of At `should be the model name.
The query is asked to the input program for :code:`ranked_answers/3
and to the
given progarm for ranked_answers/4
.
Predicates
explain_lift(:At:atom,-Exp:list) is multi
explain_lift(:At:atom,+Program:probabilistic_program,-Exp:list) is multi1
returns the explanation of atom At
given by the
input program. The first argument of At
should be the model name.
The explanation is a list of pairs (P-Ex)
where P
is the probability
in the head of a rule H:P:-B
and Ex
is a true grounding of B
.
The query is asked to the input program for explain_lift/2
and to the
given progarm for explain_lift/3
.
Hyper-parameters for Learning¶
Hyper-parameters are set with commands of the form
:- set_lift(<parameter>,<value>).
and read with commands of the form
:- setting_lift(<parameter>,<value>).
hyper-parameters
parameter_learning
: (values:{em,em_python,gd,gd_python,lbfgs}
, default value:em
) parameter learning algorithmsingle_var
(values:{true,false}
, default value:false
): if set totrue
, there is a random variable for each clause, instead of a different random variable for each grounding of each clauseprocessor
: (values:{cpu,gpu}
, default value:cpu
) which processor Python will use for parameter learningregularization
: (values:{no,l1,l2,bayes}
, default value:l1
) type of regularizationgamma
(values: real number, default value:10
): regularization coefficient for L1 and L2ab
(values: list of two real numbers, default value:[0,10]
): values of a and b for bayesian regularizationeta
(values: real number, default value:0.01
): eta parameter in gradient descent (the parameters are updated as par=par+eta*gradient)max_initial_weight
(values: real number , default value: 0.5): weights in lbfgs and gd are randomly initialized with values in the interval [-max_initial_weight, max_initial_weight].min_probability
(values: real number in[0,1]
, default value:1e-5
): probability threshold under which a clause is dropped out.eps
(values: real, default value: 0.0001): if the difference in the log likelihood in two successive parameter learning iterations is smaller thaneps
, then parameter learning stops.eps_f
(values: real, default value: 0.00001): if the difference in the log likelihood in two successive parameter learning iterations is smaller thaneps_f*(-current log likelihood)
, then LIFTCOVER stops.random_restarts_number
(values: integer, default value: 1): number of random restarts of parameter learning algorithmsrandom_restarts_number_str_learn
(values: integer, default value: 1): number of random restarts during structure learning for learning the parameter of single clausesiter
(values: integer, default value: -1): maximum number of parameter learning iterations (-1 means not limits)max_iter
(values: integer, default value:10
): iterations of clause search.beamsize
(values: integer, default value: 100): size of the beam in the search for clausesneg_ex
(values:given
,cw
, default value:cw
): if set togiven
, the negative examples in training and testing are taken from the test folds interpretations, i.e., those examplesex
stored asneg(ex)
; if set tocw
, the negative examples in training and testing are generated according to the closed world assumption, i.e., all atoms for target predicates that are not positive examples. The set of all atoms is obtained by collecting the set of constants for each type of the arguments of the target predicate, so the target predicates must have at least one fact formodeh/2
ormodeb/2
also for parameter learning.specialization
: (values:{bottom,mode}
, default value:bottom
) specialization mode.megaex_bottom
(values: integer, default value: 1, valid for SLEAHP): number of mega-examples on which to build the bottom clauses.initial_clauses_per_megaex
(values: integer, default value: 1, valid for SLEAHP): number of bottom clauses to build for each mega-example (or model or interpretation).d
(values: integer, default value: 1, valid for SLEAHP): number of saturation steps when building the bottom clause.max_var
(values: integer, default value: 4): maximum number of distinct variables in a clausemaxdepth_var
(values: integer, default value: 2): maximum depth of variables in clauses (as defined in [Coh95]).max_body_length
(values: integer, default value: 100): maximum number of literals in the body of clausesmax_clauses
(values: integer, default value: 1000): maximum number of clauses in the theoryneg_literals
(values:{true,false}
, default value:false
): whether to consider negative literals when building the bottom clauseminus_infinity
: (values: real, default value: -1.0e20) minus infinitylogzero
(values: negative real, default value \(\log(0.000001)\)): value assigned to \(\log(0)\).zero
(values: real, default value \(0.000001\)): value assigned to \(0\).seed
(values: seed(integer) or seed(random), default valueseed(3032)
): seed for the Prolog random functions, see SWI-Prolog manual .verbosity
(values: integer in[1,4]
, default value:1
): level of verbosity of the algorithms.threads
(values: integer orcpu
, default value:1
): number of threads to use for computing clause statistics in scoring clause refinements and parameter learning. Ifcpu
, the number of threads is equal to the number of cores of the machine.
Example Files¶
The pack/liftcover/prolog/examples
folder in SWI-Prolog home contains some example programs.
The pack/liftcover/docs
folder contains this manual in latex, html and pdf.
Manual in PDF¶
A PDF version of the manual is available at https://friguzzi.github.io/liftcover/_build/latex/liftcover.pdf.
License¶
phil follows the MIT License that you can find in phil root folder. The copyright is by Fabrizio Riguzzi and Arnaud Nguembang Fadja.
- Coh95
William W. Cohen. Pac-learning non-recursive prolog clauses. Artif. Intell., 79(1):1–38, 1995.
- DRVL95
L. De Raedt and W. Van Laer. Inductive constraint logic. In Proceedings of the 6th Conference on Algorithmic Learning Theory (ALT 1995), volume 997 of LNAI, 80–94. Fukuoka, Japan, 1995. Springer.
- GBA+23
Elisabetta Gentili, Alice Bizzarri, Damiano Azzolini, Riccardo Zese, and Fabrizio Riguzzi. Regularization in probabilistic inductive logic programming. In Elena Bellodi, Francesca Alessandra Lisi, and Riccardo Zese, editors, Inductive Logic Programming - ILP 2023, volume 14363 of Lecture Notes in Artificial Intelligence, 16–29. Cham, 2023. Springer Nature Switzerland. doi:10.1007/978-3-031-49299-0_2.
- KD05
Stanley Kok and Pedro Domingos. Learning the structure of markov logic networks. In Proceedings of the 22nd international conference on Machine learning, 441–448. 2005.
- NFR19
Arnaud Nguembang Fadja and Fabrizio Riguzzi. Lifted discriminative learning of probabilistic logic programs. Machine Learning, 108(7):1111–1135, 2019. doi:10.1007/s10994-018-5750-0.
- VV03
J. Vennekens and S. Verbaeten. Logic programs with annotated disjunctions. Technical Report CW386, K. U. Leuven, 2003.
- VVB04
J. Vennekens, S. Verbaeten, and M. Bruynooghe. Logic programs with annotated disjunctions. In International Conference on Logic Programming, volume 3131 of LNCS, 195–209. Springer, 2004.