18.S996: Algorithmic Aspects of Machine Learning
Fall 2013
Modern machine learning systems are often built on top of algorithms that lack provable guarantees, and when and why they work is a subject of debate. In this class, we will focus on designing algorithms whose performance we can rigorously analyze for fundamental machine learning problems. We will cover topics such as: spectral clustering, learning mixture models, matrix completion, tensor decomposition, nonnegative matrix factorization, sparse dictionary learning, inference in graphical models and deep learning. Almost all of these problems are computationally hard in the worst case, and so developing an algorithmic theory is about (1) choosing the right models in which to study these problems and (2) developing the appropriate mathematical tools (often from probability, geometry or algebra) in order to analyze the performance of an algorithm.
Announcement 1: Info about Final Projects is available here. Please send me your final report by December 14th. Here is a sample final project by Yufei Zhao.
Announcement 2: Here are the lecture notes for the course!
Course Information
 Instructor: Ankur Moitra
Office Hours: send me an email.
 Lecture: Mondays and Wednesdays 9:30-11:00, E17-122
 Classroom update: We will have to stay in the current room, despite my best efforts. Unfortunately, there are no big classrooms in the new math building, and the old math building is being renovated.
 Lecture Notes: Lecture notes and/or presentations will be provided.
 Prerequisites: An advanced course in algorithms (6.854/18.415 or equivalent) and probability (6.041/18.440 or equivalent)
 Assessment: Students will be expected to scribe one or two lectures, solve one or two problem sets, and complete a research-oriented final project. This could be either a survey or original research; students will be encouraged to find connections between the course material and their own research interests.
Course Outline
Here is a tentative outline for the course:
 Nonnegative Matrix Factorization
 Qualitative Comparisons to SVD
 New Algorithms from Algebra
 New Algorithms from Geometry
 Applications to Topic Models
 Tensor Decompositions
 Basic Definitions and Uniqueness
 Perturbation Bounds for Eigendecompositions
 Phylogenetic Reconstruction/HMMs
 Topic Models and Community Discovery
 Independent Component Analysis
 Sparse Representations
 Uncertainty Principles and Uniqueness
 Pursuit Algorithms and their Guarantees
 Compressed Sensing I: Prony's Method
 Compressed Sensing II: Stable Recovery
 Dictionary Learning I: Full-Rank
 Dictionary Learning II: Incoherent and Overcomplete
 Mixture Models and Clustering
 Gaussian Mixture Models I: WellSeparated
 Gaussian Mixture Models II: Beyond Clustering
 Gaussian Mixture Models III: Tools from Algebraic Geometry
 Matrix Completion
 Exact Recovery I: Nuclear Norm
 Exact Recovery II: Quantum Golfing
 Alternating Minimization
 Sparse PCA
 Miscellaneous
 Open Questions about Graphical Models and Deep Learning
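To give a flavor of the algorithms covered in the outline above, here is a minimal NumPy sketch of alternating minimization for low-rank matrix completion (one of the listed topics). The rank, observation probability, and iteration count below are illustrative assumptions for a toy instance, not the specific algorithm analyzed in the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth rank-2 matrix M = U V^T, with a random subset of entries observed.
n, m, r = 30, 30, 2
U_true = rng.standard_normal((n, r))
V_true = rng.standard_normal((m, r))
M = U_true @ V_true.T
mask = rng.random((n, m)) < 0.5  # Boolean mask of observed entries

# Alternating minimization: fix V and solve a least-squares problem for each
# row of U over its observed entries, then swap roles and repeat.
U = rng.standard_normal((n, r))
V = rng.standard_normal((m, r))
for _ in range(50):
    for i in range(n):
        idx = mask[i]
        U[i], *_ = np.linalg.lstsq(V[idx], M[i, idx], rcond=None)
    for j in range(m):
        idx = mask[:, j]
        V[j], *_ = np.linalg.lstsq(U[idx], M[idx, j], rcond=None)

# Relative error on the observed entries; on easy random instances like this
# one, the full matrix is typically recovered as well.
err = np.linalg.norm(mask * (U @ V.T - M)) / np.linalg.norm(mask * M)
print(f"relative error on observed entries: {err:.2e}")
```

Each inner step is a small least-squares solve, which is what makes the heuristic fast in practice; the course material (e.g., Jain, Netrapalli, Sanghavi below) asks when such nonconvex updates provably converge to the true low-rank matrix.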
References

A. Anandkumar, D. Foster, D. Hsu, S. Kakade, Y. Liu A Spectral Algorithm for Latent Dirichlet Allocation

S. Arora, R. Ge, R. Kannan, A. Moitra Computing a Nonnegative Matrix Factorization -- Provably

S. Arora, R. Ge, A. Moitra Learning Topic Models -- Going Beyond SVD

S. Arora, R. Ge, A. Moitra New Algorithms for Learning Incoherent and Overcomplete Dictionaries

M. Belkin, K. Sinha Polynomial Learning of Distribution Families

A. Bhaskara, M. Charikar, A. Vijayaraghavan Uniqueness of Tensor Decompositions with Applications to Polynomial Identifiability

E. Candes, X. Li, Y. Ma, J. Wright Robust Principal Component Analysis?

E. Candes, B. Recht Exact Matrix Completion via Convex Optimization

V. Chandrasekaran, P. Parrilo, A. Willsky Latent Variable Graphical Model Selection via Convex Optimization

V. Chandrasekaran, S. Sanghavi, P. Parrilo, A. Willsky Rank-Sparsity Incoherence for Matrix Decomposition

U. Feige, J. Kilian Heuristics for Semirandom Graph Problems

A. Frieze, M. Jerrum, R. Kannan Learning Linear Transformations

D. Hsu, S. Kakade Learning Mixtures of Spherical Gaussians: Moment Methods and Spectral Decompositions

P. Jain, P. Netrapalli, S. Sanghavi Low-rank Matrix Completion using Alternating Minimization

A. Kalai, A. Moitra, G. Valiant Efficiently Learning Mixtures of Gaussians

A. Kumar, R. Kannan Clustering with Spectral Norm and the k-Means Algorithm

F. McSherry Spectral Partitioning of Random Graphs

A. Moitra, G. Valiant Settling the Polynomial Learnability of Mixtures of Gaussians

A. Moitra, M. Saks A Polynomial Time Algorithm for Lossy Population Recovery

E. Mossel, S. Roch Learning Nonsingular Phylogenies and Hidden Markov Models

R. Ostrovsky, Y. Rabani, L. Schulman, C. Swamy The Effectiveness of Lloyd-Type Methods for the k-Means Problem

B. Recht A Simpler Approach to Matrix Completion

D. Spielman, H. Wang, J. Wright Exact Recovery of SparselyUsed Dictionaries

A. Wigderson, A. Yehudayoff Population Recovery and Partial Identification