Naive Bayes

Naive Bayes is a simple probabilistic approach to classification that assumes the attributes are independent of each other given the class. More precisely, it aims at modelling the conditional distribution

P(C|A1,...,An)

of a class variable C given some attributes A1,...,An. In general, when the number n of attributes gets large, a direct specification of P(C|A1,...,An), for instance using a table, is not feasible, because such a table needs one entry for every combination of attribute values. To come up with a solution, let us first apply Bayes' rule:

P(C|A1,...,An) = P(C) * P(A1,...,An|C) / P(A1,...,An)

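Because the probability P(A1,...,An) is the same for all classes c, it only acts as a normalizing constant and we can drop it when comparing classes. Using the chain rule of probabilities to expand P(A1,...,An|C), this yields, up to a factor that does not depend on the class,

P(C|A1,...,An) ∝ P(C) * Prod P(Ai|Ai-1,...,A1,C).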

Now, the Naive Bayes approach assumes all Ai's to be conditionally independent given the class variable C. That is,

P(Ai|Ai-1,...,A1,C) = P(Ai|C).
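To get a feeling for what this assumption buys, the following Python sketch counts table entries for a direct specification of P(C|A1,...,An) and for the factors P(C) and P(Ai|C). The function names and the simplification that every attribute has the same number of values are assumptions made for illustration only.

```python
def full_table_entries(n_attributes: int, values_per_attribute: int, n_classes: int) -> int:
    """Entries in a direct table of P(C | A1,...,An)."""
    # One row per joint configuration of the attributes, one column per class.
    return (values_per_attribute ** n_attributes) * n_classes


def naive_bayes_entries(n_attributes: int, values_per_attribute: int, n_classes: int) -> int:
    """Entries in the table P(C) plus one table P(Ai | C) per attribute."""
    return n_classes + n_attributes * values_per_attribute * n_classes


# Binary attributes and a binary class (an illustrative assumption).
for n in (5, 10, 20):
    print(n, full_table_entries(n, 2, 2), naive_bayes_entries(n, 2, 2))
```

The first count grows exponentially in n, whereas the second grows only linearly.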

Plugging everything together results in the following expression, which equals P(C|A1,...,An) up to the normalizing constant P(A1,...,An):

P(C|A1,...,An) ∝ P(C) * Prod P(Ai|C).
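As a minimal sketch of how this product can be turned into a classifier, the Python code below estimates P(C) and P(Ai|C) by relative frequencies from a small table and then normalizes P(C) * Prod P(Ai|C) over the classes. The function names, the toy data, and the choice of plain frequency estimates without smoothing are assumptions for illustration, not part of the description above.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Estimate P(C) and the counts behind P(Ai | C) from categorical data."""
    class_counts = Counter(labels)
    # value_counts[(i, c)][v] = number of rows of class c whose attribute i equals v
    value_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
    prior = {c: k / len(labels) for c, k in class_counts.items()}
    return prior, value_counts, class_counts


def posterior(model, row):
    """Return P(C | A1,...,An) for one row by normalizing P(C) * Prod P(Ai | C)."""
    prior, value_counts, class_counts = model
    scores = {}
    for c, p_c in prior.items():
        score = p_c
        for i, v in enumerate(row):
            # Relative-frequency estimate of P(Ai = v | C = c), no smoothing.
            score *= value_counts[(i, c)][v] / class_counts[c]
        scores[c] = score
    total = sum(scores.values())  # plays the role of P(A1,...,An)
    if total == 0:
        raise ValueError("every class has probability zero for this row")
    return {c: s / total for c, s in scores.items()}


# Hypothetical toy data: attributes (outlook, windy), class "play".
rows = [("sunny", "no"), ("sunny", "no"), ("rainy", "no"), ("sunny", "yes"),
        ("rainy", "yes"), ("rainy", "yes"), ("sunny", "yes"), ("rainy", "no")]
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]

model = train(rows, labels)
print(posterior(model, ("sunny", "no")))
```

For the query ("sunny", "no"), the unnormalized scores are 0.5 * 3/4 * 3/4 for class "yes" and 0.5 * 1/4 * 1/4 for class "no", so the normalized posterior favours "yes" with probability 0.9.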

Although the strong independence assumption is typically violated by real data, the Naive Bayes classifier often performs surprisingly well in practice.

The Naive Bayes approach can be elegantly exemplified and graphically represented using Bayesian networks.
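Seen as a Bayesian network, the model has the class C as the only parent of every attribute Ai. The short Python sketch below encodes this structure as a mapping from each node to its parents and reads the factorization off it; the function names and the dictionary representation are illustrative assumptions, not a particular library's API.

```python
def naive_bayes_structure(class_name, attribute_names):
    """Map each node of the network to the list of its parents."""
    parents = {class_name: []}  # the class node has no parents
    for attribute in attribute_names:
        parents[attribute] = [class_name]  # one edge from the class to each attribute
    return parents


def factorization(parents):
    """Write the joint distribution as a product of P(node | parents)."""
    factors = []
    for node, node_parents in parents.items():
        if node_parents:
            factors.append(f"P({node}|{','.join(node_parents)})")
        else:
            factors.append(f"P({node})")
    return " * ".join(factors)


print(factorization(naive_bayes_structure("C", ["A1", "A2", "A3"])))
# prints: P(C) * P(A1|C) * P(A2|C) * P(A3|C)
```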