6.438 Fall 2011 Recitation Highlights (alpha)
This is an experimental webpage for Khan-Academy-style video highlights of
select recitations. Answers to some popular questions from recitations or from
office hours may appear here as well.
Warning: The material here is a bit informal, will be
posted with a slight delay, and is not meant to replace recitations.
Gaussian mixture models: intro and parameter estimation with EM
Remark on initializing EM:
A popular way to initialize EM for Gaussian mixture models (GMMs) is with
the K-means algorithm, which is very similar to the EM algorithm for GMMs;
the only differences are the following (see the code sketch after the list):
- E-step: K-means finds, for each point, the closest cluster mean (in
Euclidean distance) and assigns the point to that cluster, so
effectively we are doing a hard assignment of labels rather than a soft
assignment. Thus, for K-means, this is called the "assignment" step
rather than the E-step.
- M-step: Because K-means assigns each point to the closest cluster in
the assignment step, we are essentially assuming all cluster covariances
are identity matrices, so we only need to update the means and not the
covariances. For K-means, this is called the "update" step.
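To make the parallel concrete, here is a minimal K-means sketch in NumPy
(illustrative, not code from the recitation; the function name kmeans and
the random choice of initial means are assumptions):

    import numpy as np

    def kmeans(X, K, n_iters=100, seed=0):
        """Hard-assignment clustering; X is an (n, d) array of points."""
        rng = np.random.default_rng(seed)
        # Initialize the means by picking K distinct data points at random.
        means = X[rng.choice(len(X), size=K, replace=False)]
        for _ in range(n_iters):
            # "Assignment" step (hard analogue of the E-step): label each
            # point with the index of its nearest mean in Euclidean distance.
            dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # "Update" step (analogue of the M-step): recompute each mean as
            # the average of its assigned points; covariances stay implicitly
            # fixed at the identity, so only the means change.
            for k in range(K):
                if np.any(labels == k):
                    means[k] = X[labels == k].mean(axis=0)
        return means, labels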
So K-means is both simpler and computationally cheaper than the EM
algorithm for GMMs. It produces a cluster assignment for each data point,
from which we can compute means and covariances to initialize EM for GMMs.
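As a sketch of that last step (the helper name gmm_init_from_kmeans is
made up for illustration, and it assumes every cluster receives at least
two points so the sample covariances are well-defined):

    import numpy as np

    def gmm_init_from_kmeans(X, labels, K):
        """Turn hard K-means labels into initial GMM parameters for EM."""
        # Per-cluster sample means, sample covariances, and mixing weights.
        means = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        covs = np.array([np.cov(X[labels == k].T) for k in range(K)])
        weights = np.array([np.mean(labels == k) for k in range(K)])
        return means, covs, weights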
Of course, the K-means algorithm itself needs to be initialized!
A simple way to initialize is to just randomly choose K of the observed data
points as initial cluster means. A more clever way to initialize is the
K-means++ algorithm
(Arthur and Vassilvitskii 2007).
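For reference, a minimal sketch of the K-means++ seeding rule (an
illustrative NumPy implementation, not the authors' reference code): the
first center is a uniformly random data point, and each subsequent center
is drawn with probability proportional to its squared distance from the
nearest center chosen so far.

    import numpy as np

    def kmeans_pp_init(X, K, seed=0):
        """K-means++ seeding: spread-out initial centers for K-means."""
        rng = np.random.default_rng(seed)
        # First center: a uniformly random data point.
        centers = [X[rng.integers(len(X))]]
        for _ in range(K - 1):
            # Squared distance from every point to its nearest chosen center.
            diffs = X[:, None, :] - np.array(centers)[None, :, :]
            d2 = (diffs ** 2).sum(axis=2).min(axis=1)
            # Draw the next center with probability proportional to d2, so
            # points far from all current centers are likelier to be picked.
            centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
        return np.array(centers)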
TA:
George H. Chen
Feel free to send feedback via email: geor...@csail.mit.edu