Inductive Logic Programming (ILP)

Inductive Logic Programming (ILP) is a research field at the intersection of machine learning and logic programming. It aims at a formal framework as well as practical algorithms for inductively learning relational descriptions (in the form of logic programs) from examples and background knowledge.

ILP approaches can be clustered into three main families. One can start from short clauses, iteratively adding literals to their bodies as long as they do not become overly specific (top-down approaches); one can start from long clauses, iteratively removing literals until they would become overly general (bottom-up approaches); or one can follow a hybrid approach mixing top-down and bottom-up search. Hybrid approaches are usually employed for multiple predicate learning and theory revision.

Top-Down Approaches

Basically, in top-down approaches, hypotheses are generated in a predetermined order and then tested against the examples. More precisely, they start with the most general hypothesis, i.e., clauses of the form daughter(C,P) :- true, where all arguments are distinct variables. After seeing the first example contradicting the hypothesis, i.e., the first covered negative example, the hypothesis is specialized by selecting a clause, which is then specialized typically in three ways: by applying a substitution, by adding a literal (an atom or its negation) to the body, or by adding a new clause, which in turn is specialized.

Example

For instance, we can consider daughter(C,P) :- female(C) and daughter(C,P) :- mother(P,C) for further investigation. Several possibilities (and successive specializations) have to be tried before one finds a clause that covers some positive examples but no negative ones, such as daughter(C,P) :- female(C), mother(P,C). Because some positive examples are still not covered, we add a new, maximally general clause to the hypothesis and essentially iterate the process as before, until all positive examples and no negative examples are covered.
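To make the coverage test concrete, here is a minimal Python sketch that checks whether the clause daughter(C,P) :- female(C), mother(P,C) covers given examples relative to a set of background facts. The toy family database, constant names, and clause encoding are all illustrative assumptions, not part of any particular ILP system:

```python
from itertools import product

# Hypothetical background knowledge as ground facts, mother(Parent, Child).
background = {
    ("female", ("mary",)), ("female", ("ann",)),
    ("mother", ("ann", "mary")),
    ("mother", ("ann", "tom")),
}
constants = ["mary", "ann", "tom"]

def covers(body, example):
    """Check whether daughter(C,P) :- body covers the example (c, p):
    bind the head variables C and P to the example, then try all
    groundings of remaining variables against the background facts."""
    c, p = example
    def ground(term, sub):
        pred, args = term
        return (pred, tuple(sub.get(a, a) for a in args))
    sub = {"C": c, "P": p}
    free = sorted({a for _, args in body for a in args
                   if a.isupper() and a not in sub})
    for binding in product(constants, repeat=len(free)):
        s = dict(sub, **dict(zip(free, binding)))
        if all(ground(lit, s) in background for lit in body):
            return True
    return False

clause_body = [("female", ("C",)), ("mother", ("P", "C"))]
positives = [("mary", "ann")]   # daughter(mary, ann) holds
negatives = [("tom", "ann")]    # tom is not a daughter
print(all(covers(clause_body, e) for e in positives))   # True
print(any(covers(clause_body, e) for e in negatives))   # False
```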

To employ the background knowledge, it is often convenient to view it as a logic program B (i.e., a definite clause program) that is provided to the inductive logic programming system and fixed during the learning process. The hypothesis H together with the background theory B should cover all positive and none of the negative examples.

Top-down approaches are for instance often employed by ILP systems that learn from entailment. More precisely, these systems often employ a separate-and-conquer rule-learning strategy.

In an outer loop of the algorithm, they follow a set-covering approach, in which they repeatedly search for a rule covering many positive examples and none of the negative examples. They then delete the positive examples covered by the current clause and repeat this process until all positive examples have been covered. In the inner loop of the algorithm, they typically refine a clause by unifying variables, by instantiating variables to constants, and/or by adding literals to the clause.
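The outer set-covering loop can be sketched as follows. This is a simplified Python outline, not a particular system's implementation; learn_one_rule stands in for the inner refinement search and is assumed to return a rule together with the set of positives it covers:

```python
def separate_and_conquer(positives, negatives, learn_one_rule):
    """Set-covering loop: repeatedly learn a rule that covers some
    positive examples and none of the negatives, then remove the
    covered positives and repeat until all positives are covered."""
    theory, remaining = [], set(positives)
    while remaining:
        rule, covered = learn_one_rule(remaining, negatives)
        if not covered:      # no progress; stop to avoid looping forever
            break
        theory.append(rule)
        remaining -= covered
    return theory

# Toy stand-in for the inner loop: "learn" a rule covering one example.
toy = lambda pos, neg: (("rule_for", min(pos)), {min(pos)})
print(separate_and_conquer({1, 2, 3}, set(), toy))
```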

ILP systems that learn from interpretations work in a similar fashion as those that learn from entailment.

Bottom-Up Approaches

While top-down approaches successively specialize a very general starting hypothesis, bottom-up approaches successively generalize a very specific hypothesis.


Traversing the Hypotheses Space

Thus, ILP approaches iteratively modify the current hypothesis syntactically and test it against the examples and background theory. The syntactic modifications are done using so-called refinement operators, which make small modifications to a hypothesis.

Refinement Operator

A refinement operator r takes a hypothesis H and returns a syntactically modified version H' of H.

For clauses, a specialization operator s and a generalization operator g are usually employed. A specialization operator basically adds a literal, unifies variables, or grounds variables, whereas a generalization operator deletes a literal, anti-unifies variables, or replaces constants with variables.

Example

Specializations of the clause daughter(C,P) :- female(C) include
     daughter(C,ann) :- female(C)
by grounding variables (substituting ann for P),
     daughter(C,C) :- female(C)
by unifying variables,
     daughter(C,P) :- female(C), mother(P,C)
by adding a literal. Generalizations of daughter(C,rex) :- female(C), mother(rex,C) include
     daughter(C,P) :- female(C), mother(P,C)
by turning constants into variables,
     daughter(C,P1) :- female(C), mother(P2,C)
by anti-unifying (replacing the two occurrences of rex by distinct variables),
     daughter(C,rex) :- female(C)
by deleting a literal.
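A specialization operator for a single clause can be sketched in Python as follows. The clause encoding and predicate names are illustrative assumptions; the three branches correspond to grounding a variable, unifying two variables, and adding a literal:

```python
from itertools import combinations

def specializations(clause, constants, new_literals):
    """Enumerate one-step specializations of a clause (head, body),
    where each literal is a (predicate, args) tuple and uppercase
    strings denote variables."""
    head, body = clause
    def subst(s):
        rn = lambda t: (t[0], tuple(s.get(a, a) for a in t[1]))
        return (rn(head), tuple(rn(lit) for lit in body))
    variables = sorted({a for _, args in (head, *body) for a in args
                        if isinstance(a, str) and a.isupper()})
    for v in variables:                      # ground a variable
        for c in constants:
            yield subst({v: c})
    for v, w in combinations(variables, 2):  # unify two variables
        yield subst({w: v})
    for lit in new_literals:                 # add a literal to the body
        if lit not in body:
            yield (head, body + (lit,))

clause = (("daughter", ("C", "P")), (("female", ("C",)),))
refs = list(specializations(clause, ["ann"], [("mother", ("P", "C"))]))
```

With one constant and one candidate literal, this yields four refinements, among them daughter(C,C) :- female(C) and daughter(C,P) :- female(C), mother(P,C).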

Refinement operators on clauses can straightforwardly be extended to logic programs H by defining r(H) = {(H \ {c}) ∪ {c'} | c ∈ H and c' ∈ r(c)}.

Based on refinement operators, ILP systems traverse the hypothesis space H according to some notion of generality.

More-General-Than Relation

A hypothesis G is more general than a hypothesis S if all examples covered by S are also covered by G.
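As a sketch, this coverage-based definition is simply a subset check over the examples. The coverage functions below are hypothetical stand-ins for two hypotheses:

```python
def more_general(covers_g, covers_s, examples):
    """G is more general than S iff every example covered by S
    is also covered by G (coverage-based generality)."""
    return all(covers_g(e) for e in examples if covers_s(e))

covers_g = lambda x: x > 0   # hypothetical coverage function of G
covers_s = lambda x: x > 5   # hypothetical coverage function of S
examples = range(-3, 10)
assert more_general(covers_g, covers_s, examples)      # G generalizes S
assert not more_general(covers_s, covers_g, examples)  # but not vice versa
```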

Several generality frameworks have been proposed, including inverse implication, inverse resolution, and inverse entailment. In practice, however, the large majority of ILP systems use Plotkin's framework of theta-subsumption.

This general search-based approach to ILP is quite inefficient except for relatively restricted induction problems. To overcome the problem, ILP systems typically resort to a heuristic scoring function to direct the search. Several heuristics have been developed, including Laplace estimates, MDL-based measures, and Bayesian approaches. In addition, ILP systems usually include a stopping criterion that is related to the significance of the heuristic score.
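For instance, the Laplace estimate scores a clause covering p positive and n negative examples as (p + 1) / (p + n + 2), smoothing the raw precision toward 1/2 so that poorly supported clauses are not overrated. A minimal sketch:

```python
def laplace(p, n):
    """Laplace-corrected precision of a clause covering p positive
    and n negative examples: (p + 1) / (p + n + 2)."""
    return (p + 1) / (p + n + 2)

# A clause covering 9 positives and 1 negative beats one covering a
# single positive, even though the latter has perfect raw precision.
assert laplace(9, 1) > laplace(1, 0)
```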

Fighting the High Complexity

It should be stressed that ILP is a difficult problem. Practical ILP systems fight the inherent complexity of the problem by imposing all sorts of constraints, mostly syntactic in nature. Such constraints include language and search biases, and are sometimes summarized as declarative biases.

Essentially, the main source of complexity in ILP stems from the variables in the clauses. In top-down systems, the branching factor of the specialization operator increases with the number of variables in the clauses. Introducing types for predicates can rule out many substitutions and unifications.

Example

The type definition type(father(person, person)) specifies that both arguments of atoms over father/2 have to be persons.
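A refinement operator can use such declarations to discard ill-typed literals before they are ever scored. A minimal sketch, in which the age/2 predicate and the variable-to-type assignments are hypothetical:

```python
# Declared argument types per predicate (age/2 is a hypothetical example).
types = {"father": ("person", "person"), "age": ("person", "integer")}
# Types assigned to the variables occurring in the current clause.
var_types = {"C": "person", "N": "integer"}

def well_typed(literal):
    """A literal respects the type declarations if every argument's
    type matches the declared type of its position."""
    pred, args = literal
    return all(var_types.get(a) == t for a, t in zip(args, types[pred]))

assert well_typed(("father", ("C", "C")))
assert well_typed(("age", ("C", "N")))
assert not well_typed(("age", ("C", "C")))  # second argument must be an integer
```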

Furthermore, one can put a bound on the number of distinct variables that can occur in clauses. Mode declarations are another well-known ILP device. They are used to describe input-output behaviour of predicate definitions.

Example

We might specify mode(daughter(+,-)) and mode(father(-, +)), meaning that the + arguments must be instantiated when the literal is called, whereas the - arguments will be bound by the answer.
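A top-down learner can use such declarations to reject refinements whose input arguments are not yet bound by the head or by earlier body literals. A minimal sketch, where the notion of "bound" variables and the argument names are illustrative:

```python
def respects_mode(mode, args, bound):
    """mode is a tuple over '+'/'-': '+' arguments must already be
    bound by the head or by earlier body literals, while '-'
    arguments may be unbound (they receive values from the answer)."""
    return all(m == "-" or a in bound for m, a in zip(mode, args))

# daughter(+,-): the first argument must be bound before the call.
assert respects_mode(("+", "-"), ("C", "P"), bound={"C"})
assert not respects_mode(("+", "-"), ("P", "C"), bound={"C"})
```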

Refinement operators can also be used to encode a language bias, since they can be restricted to generate only a subset of the hypothesis language. For instance, refinement operators can easily be modified to generate only constant-free and function-free clauses. Other methods use a kind of grammar construction to explicitly declare the range of acceptable clauses. Lookaheads are an example of a search bias. In some cases, an atom might never be chosen by the algorithm because it does not, in itself, result in a better score. However, such an atom, while not useful on its own, might introduce new variables that make a better coverage possible when further atoms are added later on. This problem is usually solved by allowing the algorithm to look ahead in the search space: instead of considering refinements with a single atom, one considers larger refinements consisting of multiple atoms.
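A simple way to realize such a lookahead is to let the refinement operator propose small conjunctions of atoms rather than single atoms. A minimal sketch, with a configurable lookahead depth and illustrative candidate literals:

```python
from itertools import combinations

def refinements_with_lookahead(body, candidates, depth=2):
    """Besides single-literal refinements, also generate refinements
    that add up to `depth` literals at once, so that a literal which
    only pays off together with a follow-up literal can be found."""
    out = []
    for k in range(1, depth + 1):
        for lits in combinations(candidates, k):
            out.append(body + list(lits))
    return out

body = [("female", ("C",))]
cands = [("mother", ("P", "C")), ("age", ("P", "A"))]
# Two single-literal refinements plus one two-literal refinement.
assert len(refinements_with_lookahead(body, cands)) == 3
```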