Logic, Language and Learning : Logic, Language and Learning Chapter 16 : Learning Theories
Luc De Raedt
Introduction : Introduction Theory revision
Abduction
Identification in the limit
Program synthesis
Constraints
Theory Induction : Theory Induction Given
a set of examples E
A background theory B
A language L
a covers relation
Find
A theory T such that T U B correctly handles E
The Mutagenicity dataset : The Mutagenicity dataset Example : muta(225) hypothesis Background theory
Theory revision : Theory revision Given
theory T
a set of examples E
a covers relation
Find
Modify T into T’ (which should be close to T) so that T’ correctly handles E
Theory revision : Theory revision In ILP :
Theory T : Prolog/Datalog program
Examples : set of facts / clauses
Coverage : entailment (T |= e)
One way of viewing it (Shapiro)
there is an intended target theory T
this theory has a least herbrand model M(T)
examples (in the form of facts)
are positive if they are in M(T)
are negative otherwise
this is called the intended interpretation
Various settings : Various settings Single / multiple predicates (in E)
Batch (all examples given at the start)
Incremental (one example after the other – intermediate theories need to be induced)
interactive (queries about the classification of examples or the correctness of clauses may be posed to the user)
Functors (programs) or not
Recursion
So far, mainly :
Single predicate/batch/no functors/no recursion
Example 1 : Example 1 T :
grandparent(X,Y) :- parent(X,Z), parent(Z,Y)
parent(jef,paul).
E : { grandparent(jef,an) }
T’ :
assert(T, parent(paul,an))
assert(T, grandparent(jef,an))
Not only induction, also abduction
Uncovered positive
Example 2 : Example 2 T :
flies(X) :- bird(X)
bird(X) :- ostrich(X)
ostrich(oliver)
E : { not flies(oliver) }
T’ :
Retract( T, flies(X) :- bird(X))
Retract(T, bird(X) :- ostrich(X))
Retract(T , ostrich(oliver))
Retract(assert(T, flies(X) :- bird(X), normal(X)), flies(X) :- bird(X))
….
covered negative example
Problems : Problems Interdependencies among clauses
If recursion or multiple predicate learning
Predicate definitions depend on one another
One predicate may be corrected by modifying another one…
Search at the level of SETS of clauses instead of at the level of single clauses
Search space explodes !
Problems : Problems Order dependencies
order of operations retract/assert is important
(unless exhaustive search is possible)
Usually intermediate states are heuristically evaluated
Therefore order of examples (when working incrementally is important)
In case of recursion : first learn base case then recursive clause
Therefore order of learning predicates (when starting from empty theory) is important
Learn easy ones first
Family example : learn aunt and sister …
Recursion and Multiple predicate learning : Recursion and Multiple predicate learning Recursion :
Learn base case before recursive one
E.g.
anc(X,Y) :- parent(X,Y)
anc(X,Y) :- parent(X,Z), anc(Z,Y)
MPL :
Learn simple predicates before more complicated ones (defined in terms of simple ones)
E.g
uncle(X,Y) :- brother(X,Z),parent(Z,Y)
brother(X,Y) :- parent(W,Y),parent(W,X), male(X).
Techniques : Techniques
While T is not correct on E do
If T does not cover a given positive example e in E
Then generalize T wrt e
If T does cover a given negative example e in E
Then specialize T wrt e
Abduction : Abduction What ?
flies(X) :- bird(X)
Given flies(tweety) abduce bird(tweety)
Using Abduction : Using Abduction Abduction for reducing examples to one another
Basic operations : Basic operations Uncovered positive :
Use abduction and reduction to other examples
Generalize clause in T
Add clause to T
Add fact to T
Apply generalization to clause in T
Covered negative :
Use abduction and reduction to other examples
Specialize clause in T
Remove clause from T
Remove fact from T
Apply specialization to clause in T
Theory revision : example : Theory revision : example bird(X) :- ostrich(X)
bird(X) :- blackbird(X)
bird(X) :- pigeon(X)
fly(X) :- bird(X)
not fly(oliver)
ostrich(oliver)
abnormal(oliver)
replace fly(X) :- bird(X) by fly(X) :- bird(X), not abnormal(X)
Theory revision : example : Theory revision : example bird(X) :- ostrich(X)
bird(X) :- blackbird(X)
bird(X) :- pigeon(X)
fly(X) :- bird(X), not abnormal(X)
fly(tweety)
abduce blackbird(tweety) or pigeon(tweety)
Program synthesis from examples : Program synthesis from examples Induce prolog programs from examples of their input-output behaviour
Added problem wrt MPL : structured terms
E.g. member(a,[a,b,c]), member(c,[a,b,c])
member(X,[X|Y])
member(X,[U|Z]) :- member(X,Z)
state of the art :
induce quicksort if you already know partition and append
problems with recursion and term structure
too hard to be practical now...
Testing for coverage w.r.t. the intended interpretation : Testing for coverage w.r.t. the intended interpretation Test whether T U c covers e if T is possibly incomplete and inconsistent and c is clause being tested
General let c : p :- q, r, t
Test whether p =e
Test whether q , r , and t are true in M (a model)
Lazy coverage
M = E (only specified examples are known)
Eager coverage
Ask the user whether q , r , and t if not known (user = model, oracle)
Use can possibly instantiate facts q !
Adaptive coverage
Test whether T U c U E |= q , r , and t
Slide21 :
Coverage : Coverage Member example :
E.g. member(a,[a,b,c]), member(c,[a,b,c])
member(X,[X|Y])
member(X,[U|Z]) :- member(X,Z)
Lazy : e2 not covered by c2
Eager : test whether e2 covered by c2
ask whether member(c,[b,c])
Adaptive : e2 covered by c2 (and E)
Generalization frameworks : Generalization frameworks Not only theta subsumption, also inverse resolution, inverse entailment, inverse implication
These more complicated operators/frameworks are useful because of dependencies among clauses (in contrast to theta subsumption they can take into account more than one clause)
Search approaches : Search approaches How to choose the right operations ? And sequences of operations
Exhaustive enumeration of some sort
(e.g. Shapiro’s MIS)
Ask the user (e.g. using eager approach)
Use heuristics
E.g. MDL (minimum description length)
E.g. CIGOL with predicate invention and inverse resolution
E.g. accuracy …
Identification in the limit : Identification in the limit Given
A sequence of examples e1, … en, …
An intended theory T
A learning algorithm A that outputs a hypothesis H1, …., Hn, … after each example ei
Is there for all theories T and sequences E a finite time n after which T = Hi for i> n
Shapiro’s MIS : Shapiro’s MIS MIS : model inference system for program synthesis
Proved various identification in the limit results
If the sequence includes all examples at some point and the eager method is used
Then identification in the limit
The Model Inference System : The Model Inference System
Slide28 :
Use of constraintsin theory revision : Use of constraints in theory revision What are constraints ?
Here :
Clauses p1; …;pn :- q1, …, qm
(p1 or … or pn if q1 and … and qm)
Clause/constraint is satisfied if and only if there is no substitution for which q1, …,qn is true and pi is false (for all i) in the theory
Clause has to be true in least Herbrand model of the theory
Why ?
Constraints are much more expressive than examples
Examples : Examples Consider family domain
false :- grandparent(X,X)
X=Y :- father(X,W), father(Y,W)
male(X); female(X) :- human(X)
Each positive :
p:- true
Each negative :
false :- n
How to handle constraintsin theory revision : How to handle constraints in theory revision Suppose constraint is violated :
there is a substitution for which q1, …,qn is true and all pi are false in the theory
How to resolve ?
At least one of the qi should be false
Or at least one of the pi should be true
Try multiple possibilities or query the user
Or use special types of constraints, e.g. denials of the form false :- p(…),p(…),… if violated p must be specialized
Inducing constraints : Inducing constraints Can we induce constraints from theories/databases ?
Given :
A theory/database/model
A language Lh
Find :
All clauses/constraints in Lh that are satisfied in the theory
Discovering constraints : Discovering constraints Theory / Model / Database
{ human(luc), human(lieve), male(luc), female(lieve) }
L : no constants/one variable
Find :
human(X) :- male(X).
human(X) :- female(X).
female(X); male(X) :- human(X)
false :- male(X), female(X).
Clausal discovery : Clausal discovery Observe :
if h1; …; hn :- b1, … , bm is not satisfied
then there is an answer to
:-b1, …, bm, not h1, … ,not hn
therefore consider refinements :
h; h1; …, hn :- b1, …, bm and
h1; …, hn :- b1, …, bm, b
in all possible ways
(apply optimal refinement operator)
Clausal discovery : Clausal discovery Observe :
if h1; …; hn :- b1, … , bm satisfies T
then refinements
h; h1; …, hn :- b1, …, bm and
h1; …, hn :- b1, …, bm, b
also satisfy T
Claudien : algorithm : Claudien : algorithm Q := { false :- true }
H := {}
while Q is not empty do
delete c from Q
if c satisfies T
then add c to H
else add all refinements of c to Q
Optimizations : Optimizations Remove redundant clauses
If H |= c then c is redundant
Generate each clause at most once
Use optimal refinement operator
Let rho be optimal refinement operator
Then rho*(top) = Lh and each clause in Lh is generated exactly once !
Rho* : recursive closure of rho
Slide38 :
Special cases : Special cases Using the same algorithm but a different language of hypotheses it is possible to induce
Set of functional dependencies that hold in database
Set of multi valued or inclusion dependencies that hold in database
Can often be optimized for (restricted) classes of formulae
Conclusions : Conclusions Theory revision :
Hard problem
Combinatorics
Interdependencies clauses
Order dependencies
Solutions :
Interaction with user
Use of knowledge (constraints)
Possible to induce constraints too