In relational learning or inductive logic programming, one typically induces a set of rules (or clauses). The resulting rule-set then defines a disjunctive hypothesis, since an instance is classified as positive if it satisfies the conditions of one of the rules. On the other hand, a probabilistic model defines a joint probability distribution over a class variable and a set of "attributes" or "features", and the type of model constrains the joint probability distributions that can be represented.
A straightforward but powerful idea to integrate
these two approaches is to interpret the clauses or rules as
"attributes" over which a joint probability distribution can
be defined. Using Naive Bayes as the probabilistic model,
this translates into the statement that
    clauses are conditionally independent given the class.
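A minimal Python sketch of this idea, under stated assumptions (all names here are hypothetical, not from the original): each clause is treated as a boolean attribute of an instance, and classes are scored with Naive Bayes, i.e. assuming the clause features are conditionally independent given the class.

```python
import math
from collections import defaultdict

def train_naive_bayes(covers, examples, labels):
    """covers: list of predicates, one per clause; covers[i](x) is True
    iff clause i covers example x. Clauses act as boolean attributes."""
    classes = set(labels)
    prior = {c: labels.count(c) / len(labels) for c in classes}
    # P(clause_i = True | class), with Laplace smoothing
    cond = defaultdict(dict)
    for c in classes:
        xs = [x for x, y in zip(examples, labels) if y == c]
        for i, clause in enumerate(covers):
            hits = sum(clause(x) for x in xs)
            cond[c][i] = (hits + 1) / (len(xs) + 2)
    return prior, cond

def predict(covers, prior, cond, x):
    # Naive Bayes: argmax over log P(class) + sum_i log P(clause_i(x) | class)
    def score(c):
        s = math.log(prior[c])
        for i, clause in enumerate(covers):
            p = cond[c][i]
            s += math.log(p if clause(x) else 1 - p)
        return s
    return max(prior, key=score)
```

For instance, with a single clause "x is even" as the only attribute and integer examples, the model reduces to a one-feature Bernoulli Naive Bayes classifier.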
For more details, we recommend that you read
nFOIL essentially performs a covering approach similar to Quinlan's FOIL, in which features (in the form of clauses) are learned one after the other, until adding further features yields no improvement. The clauses are combined with Naive Bayes, and the search heuristic is based on class-conditional likelihood.
More precisely, the main difference between FOIL and nFOIL is that nFOIL does not follow a separate-and-conquer approach. In FOIL, separate-and-conquer is possible because the final model is taken as the disjunction of the clauses (each clause covering a certain subset of the examples), which has two consequences:
    max_{c'} l(E, C ∪ {c'})
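The greedy selection step above can be sketched as follows; this is an illustrative Python sketch, not the authors' exact algorithm, and the helper names are hypothetical. Here score(C) stands for the class-conditional log-likelihood l(E, C) of the training examples E under the Naive Bayes model built from the clause set C.

```python
def learn_nfoil_style(candidates, score, max_clauses=10, tol=1e-6):
    """Greedy clause selection in the spirit of nFOIL.
    candidates: list of candidate clauses.
    score(C): class-conditional log-likelihood l(E, C) of the
    examples under the Naive Bayes model built from clause set C."""
    C = []
    best = score(C)
    while len(C) < max_clauses:
        # pick c' maximizing l(E, C ∪ {c'}) over the remaining candidates
        pool = [c for c in candidates if c not in C]
        if not pool:
            break
        c_best = max(pool, key=lambda c: score(C + [c]))
        new = score(C + [c_best])
        if new <= best + tol:  # stop when adding a clause no longer helps
            break
        C.append(c_best)
        best = new
    return C
```

Note that, unlike separate-and-conquer, covered examples are never removed: every candidate is always evaluated against the full example set, because each clause contributes a feature to the joint Naive Bayes model rather than explaining examples away.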