Bayesian Networks

Bayesian networks are graphical models of joint probability distributions. As a running example, consider modelling the inheritance of a single gene that determines a person's blood type:

Blood Type Domain

The blood type domain is a genetic model of the inheritance of a single gene that determines a person's blood type. Each person has two copies of the chromosome containing this gene, one inherited from the mother and one inherited from the father. Occasionally a person is unavailable for testing, and yet the person's blood type must be estimated, for example to solve a crime, to test paternity, or to allocate (frozen) semen. A blood type can still be derived for that person by examining and analysing the blood types of family members.

More formally, a Bayesian network is an augmented, directed acyclic graph in which each node corresponds to a random variable xi and each edge indicates a direct influence of one random variable on another. It represents the joint probability distribution

P(x1,..., xn) 

over a fixed, finite set {x1,...,xn} of random variables. Each random variable xi has a finite set S(xi) of mutually exclusive states. The following figure shows the graph of a Bayesian network modelling the blood type example for a particular family:

The family relationships, which are taken from Jensen's stud farm example, form the basis of the graph. The network encodes, for example, that Dorothy's blood type is influenced by the genetic information of her parents Ann and Brian. The set of possible states of bt(dorothy) is

S(bt(dorothy)) = {a, b, ab, 0}; 

the sets of possible states of pc(dorothy) and mc(dorothy) are

S(pc(dorothy)) = S(mc(dorothy)) = {a, b, 0}.

The same holds for ann and brian. The direct predecessors of a node x, called the parents of x, are denoted by Pa(x). For instance,

 Pa(bt(ann)) = {pc(ann), mc(ann)}.
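
The graph structure described so far can be written down as a minimal Python sketch. The edges into pc(dorothy) and mc(dorothy) are an assumption here: the paternal copy is taken to depend on Brian's two copies and the maternal copy on Ann's two copies, and Ann and Brian are treated as founders without parents in the graph. The names `states`, `parents`, `mother` and `father` are illustrative only. The sketch uses `graphlib` from the standard library (Python 3.9 or later) to confirm that the graph is acyclic.

```python
from graphlib import TopologicalSorter

people = ["ann", "brian", "dorothy"]
# Assumed family relations: Ann is Dorothy's mother, Brian her father.
mother = {"dorothy": "ann"}
father = {"dorothy": "brian"}

states = {}   # S(x): the finite state set of each random variable
parents = {}  # Pa(x): the direct predecessors of each node

for p in people:
    states[f"pc({p})"] = ("a", "b", "0")
    states[f"mc({p})"] = ("a", "b", "0")
    states[f"bt({p})"] = ("a", "b", "ab", "0")

    # The blood type depends on the person's own two chromosome copies.
    parents[f"bt({p})"] = (f"pc({p})", f"mc({p})")

    # Assumption: each inherited copy depends on both copies of the
    # corresponding parent; founders have no parents in the graph.
    parents[f"pc({p})"] = (
        (f"pc({father[p]})", f"mc({father[p]})") if p in father else ()
    )
    parents[f"mc({p})"] = (
        (f"pc({mother[p]})", f"mc({mother[p]})") if p in mother else ()
    )

# TopologicalSorter expects node -> predecessors and raises CycleError
# if the graph is not acyclic.
order = list(TopologicalSorter(parents).static_order())

print(parents["bt(ann)"])
print(order)
```

Storing the parents as ordered tuples rather than sets is a design choice of this sketch: it fixes the argument order when conditional probabilities are looked up later.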

A Bayesian network stipulates the following conditional independence assumption:

Independence Assumption

Each node xi in the graph is conditionally independent of any subset A of nodes that are not descendants of xi given a joint state of Pa(xi), i.e.
P(xi | A, Pa(xi)) = P(xi | Pa(xi)).

For example, bt(dorothy) is conditionally independent of bt(ann) given a joint state of its parents {pc(dorothy), mc(dorothy)}. The pair (xi, Pa(xi)) is called the family of xi, denoted Fa(xi); e.g. bt(dorothy)'s family is (bt(dorothy), {pc(dorothy), mc(dorothy)}). Because of the conditional independence assumption, the joint probability distribution factorizes as

P(x1,...,xn) = Prod_{i=1}^{n} P(xi | Pa(xi)).

Accordingly, each node xi of the graph is associated with the conditional probability distribution P(xi | Pa(xi)), denoted cpd(xi).
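
To make the factorization concrete, the following Python sketch attaches a cpd to every node of the three-person family, multiplies the cpds to obtain the joint distribution, and checks by brute-force enumeration that the joint sums to one and that bt(dorothy) is independent of bt(ann) given pc(dorothy) and mc(dorothy). None of the numbers come from the text above. They are assumptions made for illustration: founders get a uniform prior over the alleles, a child receives either copy of a parent with probability 0.5, and the blood type follows the usual ABO rule deterministically. The graph structure (paternal copy from Brian, maternal copy from Ann) is likewise assumed, and all function names are hypothetical.

```python
from itertools import product

ALLELES = ("a", "b", "0")
BLOOD_TYPES = ("a", "b", "ab", "0")

def phenotype(pc, mc):
    """Assumed ABO rule: a and b dominate 0; a together with b gives ab."""
    genes = {pc, mc} - {"0"}
    return "".join(sorted(genes)) if genes else "0"

def cpd_founder(value, parent_values):
    # Assumption: uniform prior over alleles for persons without parents.
    return 1.0 / len(ALLELES)

def cpd_inherit(value, parent_values):
    # Assumption: either copy of the parent is passed on with probability 0.5.
    u, v = parent_values
    return 0.5 * (value == u) + 0.5 * (value == v)

def cpd_bt(value, parent_values):
    # Deterministic cpd: probability 1 for the phenotype of the two copies.
    return 1.0 if value == phenotype(*parent_values) else 0.0

# node -> (S(node), Pa(node), cpd(node))
network = {
    "pc(ann)": (ALLELES, (), cpd_founder),
    "mc(ann)": (ALLELES, (), cpd_founder),
    "pc(brian)": (ALLELES, (), cpd_founder),
    "mc(brian)": (ALLELES, (), cpd_founder),
    "pc(dorothy)": (ALLELES, ("pc(brian)", "mc(brian)"), cpd_inherit),
    "mc(dorothy)": (ALLELES, ("pc(ann)", "mc(ann)"), cpd_inherit),
    "bt(ann)": (BLOOD_TYPES, ("pc(ann)", "mc(ann)"), cpd_bt),
    "bt(brian)": (BLOOD_TYPES, ("pc(brian)", "mc(brian)"), cpd_bt),
    "bt(dorothy)": (BLOOD_TYPES, ("pc(dorothy)", "mc(dorothy)"), cpd_bt),
}

def joint(assignment):
    """P(x1,...,xn) as the product of P(xi | Pa(xi)) over all nodes."""
    p = 1.0
    for node, (_, parents, cpd) in network.items():
        p *= cpd(assignment[node], tuple(assignment[q] for q in parents))
    return p

# Enumerate every joint state once and keep the marginal over
# (bt(dorothy), bt(ann), pc(dorothy), mc(dorothy)).
nodes = list(network)
table = {}
total = 0.0
for values in product(*(network[n][0] for n in nodes)):
    assignment = dict(zip(nodes, values))
    p = joint(assignment)
    total += p
    key = tuple(assignment[n] for n in
                ("bt(dorothy)", "bt(ann)", "pc(dorothy)", "mc(dorothy)"))
    table[key] = table.get(key, 0.0) + p

def independent_of_bt_ann(tol=1e-12):
    """Compare P(bt_d | bt_a, pc_d, mc_d) with P(bt_d | pc_d, mc_d)."""
    for pc_d, mc_d in product(ALLELES, ALLELES):
        without = {s: sum(table[(s, t, pc_d, mc_d)] for t in BLOOD_TYPES)
                   for s in BLOOD_TYPES}
        norm = sum(without.values())
        for t in BLOOD_TYPES:
            denom = sum(table[(s, t, pc_d, mc_d)] for s in BLOOD_TYPES)
            if denom == 0.0:
                continue  # conditioning event has probability zero
            for s in BLOOD_TYPES:
                lhs = table[(s, t, pc_d, mc_d)] / denom
                if abs(lhs - without[s] / norm) > tol:
                    return False
    return True

print("sum of joint:", total)
print("bt(dorothy) independent of bt(ann) given its parents:",
      independent_of_bt_ann())
```

Full enumeration is only feasible because this network has nine small variables; it serves here to check the factorization and is not meant as an inference method.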