6.438 Fall 2011 Recitation Highlights (alpha)
This is an experimental webpage for Khan-Academy-style video highlights of
select recitations. Answers to some popular questions from recitations or from
office hours may appear here as well.
Warning: The material here is a bit informal, will be
posted with a slight delay, and is not meant to replace recitations.
Gaussian mixture models: intro and parameter estimation with EM
Remark on initializing EM:
A popular way to initialize EM for Gaussian mixture models (GMMs) is with
the K-means algorithm, which is very similar to the EM algorithm for GMMs;
the only differences are the following (see the code sketch after the list):
- E-step: K-means finds, for each point, the closest cluster mean (in
Euclidean distance) and assigns the point to that cluster, so
effectively we are doing a hard assignment of labels rather than a soft
assignment. Thus, for K-means, this is called the "assignment" step
rather than the E-step.
- M-step: Because K-means assigns each point to the closest cluster in
the assignment step, we are essentially assuming all cluster covariances
are identity matrices, so we only need to update the means and not the
covariances. For K-means, this is called the "update" step.
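To make the parallel concrete, here is a minimal K-means sketch in NumPy
(illustrative, not code from the recitation; the function name kmeans and
the random choice of initial means are assumptions):

    import numpy as np

    def kmeans(X, K, n_iters=100, seed=0):
        """Hard-assignment clustering; X is an (n, d) array of points."""
        rng = np.random.default_rng(seed)
        # Initialize the means by picking K distinct data points at random.
        means = X[rng.choice(len(X), size=K, replace=False)]
        for _ in range(n_iters):
            # "Assignment" step (hard analogue of the E-step): label each
            # point with the index of its nearest mean in Euclidean distance.
            dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # "Update" step (analogue of the M-step): recompute each mean as
            # the average of its assigned points; covariances stay implicitly
            # fixed at the identity, so only the means change.
            for k in range(K):
                if np.any(labels == k):
                    means[k] = X[labels == k].mean(axis=0)
        return means, labels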
So K-means is both simpler and computationally cheaper than the EM
algorithm for GMMs. It produces a cluster assignment for each data point,
from which we can compute means and covariances to initialize EM for GMMs.
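As a sketch of that last step (the helper name gmm_init_from_kmeans is
made up for illustration, and it assumes every cluster receives at least
two points so the sample covariances are well-defined):

    import numpy as np

    def gmm_init_from_kmeans(X, labels, K):
        """Turn hard K-means labels into initial GMM parameters for EM."""
        # Per-cluster sample means, sample covariances, and mixing weights.
        means = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        covs = np.array([np.cov(X[labels == k].T) for k in range(K)])
        weights = np.array([np.mean(labels == k) for k in range(K)])
        return means, covs, weights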
Of course, the K-means algorithm itself needs to be initialized!
A simple way to initialize is to just randomly choose K of the observed data
points as initial cluster means. A more clever way to initialize is the
K-means++ algorithm
(Arthur and Vassilvitskii 2007).
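For reference, a minimal sketch of the K-means++ seeding rule (an
illustrative NumPy implementation, not the authors' reference code): the
first center is a uniformly random data point, and each subsequent center
is drawn with probability proportional to its squared distance from the
nearest center chosen so far.

    import numpy as np

    def kmeans_pp_init(X, K, seed=0):
        """K-means++ seeding: spread-out initial centers for K-means."""
        rng = np.random.default_rng(seed)
        # First center: a uniformly random data point.
        centers = [X[rng.integers(len(X))]]
        for _ in range(K - 1):
            # Squared distance from every point to its nearest chosen center.
            diffs = X[:, None, :] - np.array(centers)[None, :, :]
            d2 = (diffs ** 2).sum(axis=2).min(axis=1)
            # Draw the next center with probability proportional to d2, so
            # points far from all current centers are likelier to be picked.
            centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
        return np.array(centers)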
TA:
George H. Chen
Feel free to send feedback via email: geor...@csail.mit.edu