1. What is nFOIL?

In relational learning or inductive logic programming, one typically induces a set of rules (or clauses). The resulting rule-set then defines a disjunctive hypothesis, since an instance is classified as positive if it satisfies the conditions of one of the rules. On the other hand, a probabilistic model defines a joint probability distribution over a class variable and a set of "attributes" or "features", and the type of model constrains the joint probability distributions that can be represented.

A straightforward but powerful idea to integrate these two approaches is to interpret the clauses or rules as "attributes" over which a joint probability distribution can be defined. Using Naive Bayes as the probabilistic model, this translates into the statement that

clauses are independent.

For more details, we recommend that you read

Some more details

nFOIL essentially performs a covering approach such as Quinlan's FOIL, in which one feature (in the form of a clause) is learned after the other, until adding further features does not yield improvements. Clauses are combined with Naive Bayes. The search heuristic is based on class conditional likelihood.

More precisely, the main difference between FOIL and nFOIL is that nFOIL does not follow a separate-and-conquer approach. In FOIL, this is possible because the final model is taken as the disjunction of the clauses (every clause covering a certain subset of examples), which has two consequences:

These properties do not hold in nFOIL because every clause can affect the likelihood of all examples. Consequently, the nFOIL algorithm is obtained from FOIL by changing two components in FOIL: