18.409: Algorithmic Aspects of Machine Learning
Modern machine learning systems are often built on top of algorithms that lack provable guarantees, and when and why they work is a subject of debate. In this class, we will focus on designing algorithms whose performance we can rigorously analyze for fundamental machine learning problems. We will cover topics such as nonnegative matrix factorization, tensor decomposition, sparse coding, learning mixture models, matrix completion and inference in graphical models. Almost all of these problems are computationally hard in the worst case, so developing an algorithmic theory is about (1) choosing the right models in which to study these problems and (2) developing the appropriate mathematical tools (often from probability, geometry or algebra) to rigorously analyze existing heuristics, or to design fundamentally new algorithms.
Announcement: The course is now complete. Thanks for contributing to a great semester! Stay tuned for a new version of the monograph based on the new topics we covered this time, as well as several changes to the exposition itself.
Announcement: The Final Project is posted here and is due May 14th, by email.
Announcement: PSET 2 is posted here and is due April 23rd, after class.
Announcement: PSET 1 is posted here and is due March 5th, after class.
- Instructor: Ankur Moitra
- Lectures: Tuesday and Thursday 9:30-11:00, 37-212
- Prerequisites: A course in algorithms (6.046/18.410 or equivalent) and probability (6.041/18.440 or equivalent). You will need a strong background in algorithms, probability and linear algebra.
- Textbook: We will use this monograph. Lecture notes and/or presentations covering new topics will be provided.
- Office Hours: Tuesday and Thursday 11:00-12:00 (meet after class)
- Assessment: Students will be expected to solve a handful of problem sets, and complete a research-oriented final project. This could be either a survey, or original research; students will be encouraged to find connections between the course material and their own research interests.
Here is a tentative outline for the course:
Nonnegative Matrix Factorization [slides]
- Qualitative Comparisons to SVD
- New Algorithms via Separability
- Applications to Topic Models
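Under separability, the factorization can be found geometrically: every row of the data matrix is a convex combination of a few "anchor" rows, and those anchors are the extreme points. A minimal numpy sketch of the successive projection heuristic on a synthetic separable instance (all data below is invented for illustration, and this is one of several algorithms in this spirit, not the course's exact one):

```python
import numpy as np

def successive_projection(M, r):
    """Return the indices of r candidate anchor rows of M.

    Under separability, every row of M is a convex combination of r anchor
    rows. Since the norm is convex, the row of largest norm is an anchor;
    projecting out its direction exposes the next anchor, and so on.
    """
    R = M.astype(float).copy()
    anchors = []
    for _ in range(r):
        i = int(np.argmax(np.linalg.norm(R, axis=1)))  # farthest remaining row
        anchors.append(i)
        u = R[i] / np.linalg.norm(R[i])
        R = R - np.outer(R @ u, u)                     # project out its direction
    return anchors

# Synthetic separable instance: rows 0..2 are the anchors themselves,
# and the remaining rows are convex combinations of them.
rng = np.random.default_rng(0)
H = rng.random((3, 8))
mix = rng.dirichlet(np.ones(3), size=20)
M = np.vstack([np.eye(3), mix]) @ H
print(sorted(successive_projection(M, 3)))  # expected: [0, 1, 2]
```

The projection step is what makes the greedy choice safe: after removing an anchor's direction, the remaining rows are still convex combinations of the remaining (projected) anchors.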
Discussion: When does well-posedness lead to better algorithms?
Tensor Decompositions [slides]
- Tensor Rank, Border Rank and the Rotation Problem
- Jennrich's Algorithm and the Generalized Eigenvalue Problem
- Learning HMMs
- Mixed Membership Models and Community Detection
- Cumulants and Independent Component Analysis
C. Hillar and L. Lim. Most Tensor Problems are NP-hard, JACM 2013.
E. Mossel and S. Roch. Learning Nonsingular Phylogenies and Hidden Markov Models, STOC 2005.
A. Anandkumar, D. Foster, D. Hsu, S. Kakade and Y. Liu. A Spectral Algorithm for Latent Dirichlet Allocation, NIPS 2012.
A. Anandkumar, R. Ge, D. Hsu and S. Kakade. A Tensor Spectral Approach to Learning Mixed Membership Community Models, COLT 2013.
N. Goyal, S. Vempala and Y. Xiao. Fourier PCA, STOC 2014.
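Jennrich's algorithm reduces tensor decomposition to a generalized eigenvalue problem: contract the tensor along two random vectors and diagonalize the result. A minimal numpy sketch on a synthetic rank-3 tensor (the dimensions and random factors below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 3
A, B, C = (rng.standard_normal((n, r)) for _ in range(3))
# T = sum_i a_i (x) b_i (x) c_i, a random rank-3 tensor
T = np.einsum('ir,jr,kr->ijk', A, B, C)

# Contract the third mode with two random vectors x, y:
#   T_x = A diag(C^T x) B^T   and   T_y = A diag(C^T y) B^T.
x, y = rng.standard_normal(n), rng.standard_normal(n)
Tx = np.einsum('ijk,k->ij', T, x)
Ty = np.einsum('ijk,k->ij', T, y)

# T_x T_y^+ = A diag((C^T x) / (C^T y)) A^+, so its eigenvectors with
# nonzero eigenvalue are the columns a_i, up to scaling and order.
vals, vecs = np.linalg.eig(Tx @ np.linalg.pinv(Ty))
top = np.argsort(-np.abs(vals))[:r]
A_hat = np.real(vecs[:, top])

# Each recovered direction should align with some true column of A.
for i in range(r):
    cos = np.abs(A_hat.T @ A[:, i]) / np.linalg.norm(A[:, i])
    print(round(float(cos.max()), 3))  # ~1.0 for each column
```

Generically the eigenvalues are distinct, which is what pins the eigenvectors down; repeating the contraction along the other modes recovers B and C the same way.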
Discussion: When do algorithms rely (too much) on a distributional model?
Sparse Coding
- Sparse Recovery, Incoherence and Uncertainty Principles
- Alternating Minimization via Approximate Gradient Descent [slides]
- Sum-of-Squares and Noisy Tensor Decomposition
Discussion: When does belief propagation (provably) work?
Learning Mixture Models
- Expectation Maximization
- Clustering in High-Dimensions
- Method of Moments and Systems of Polynomial Equations [slides]
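As a concrete point of reference for EM versus the method of moments, here is a minimal EM sketch for an equal mixture of two 1-D Gaussians with known unit variance, on synthetic data (the true means of -2 and +2, sample size and initialization below are invented for illustration):

```python
import math
import random

random.seed(0)
# Synthetic data: equal mixture of N(-2, 1) and N(+2, 1).
data = [random.gauss(-2, 1) for _ in range(500)] + \
       [random.gauss(+2, 1) for _ in range(500)]

mu = [-1.0, 1.0]   # crude initial guesses for the means
w = [0.5, 0.5]     # mixing weights
for _ in range(50):
    # E-step: posterior responsibility of each component for each point
    # (variance is fixed at 1, so the normalizing constants cancel).
    resp = []
    for xpt in data:
        p = [w[j] * math.exp(-0.5 * (xpt - mu[j]) ** 2) for j in range(2)]
        s = sum(p)
        resp.append([pj / s for pj in p])
    # M-step: re-estimate weights and means from the responsibilities.
    for j in range(2):
        nj = sum(rj[j] for rj in resp)
        w[j] = nj / len(data)
        mu[j] = sum(rj[j] * xpt for rj, xpt in zip(resp, data)) / nj

print(mu)  # close to the true means -2 and +2
```

With well-separated components and a reasonable initialization EM converges quickly here; the hard regimes discussed in lecture are poor separation and bad initialization, where the method of moments gives guarantees EM lacks.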
Discussion: Is nature an adversary? And if not, how can we model and exploit that?
Linear Inverse Problems
- Nuclear Norm, Atomic Norm and Matrix Completion
- Alternating Minimization via Principal Angles
- Tensor Prediction and Random CSPs
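To make the alternating minimization bullet concrete, here is a minimal alternating least squares sketch for matrix completion on a synthetic rank-2 matrix with half of its entries observed (the dimensions, rank and sampling rate are arbitrary choices, and this is a heuristic sketch rather than one of the analyzed variants):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-2 target
mask = rng.random((n, n)) < 0.5                                # observed entries

U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
for _ in range(50):
    # Fix V and solve a small least-squares problem for each row of U,
    # using only the observed entries in that row; then swap roles.
    for i in range(n):
        obs = mask[i]
        U[i] = np.linalg.lstsq(V[obs], M[i, obs], rcond=None)[0]
    for j in range(n):
        obs = mask[:, j]
        V[j] = np.linalg.lstsq(U[obs], M[obs, j], rcond=None)[0]

err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
print(f"relative error = {err:.1e}")  # small: the unobserved entries are filled in
```

Each half-step exactly minimizes the squared error on the observed entries, so the objective is monotonically nonincreasing; the analysis in lecture concerns why, under incoherence and random sampling, this converges to the true low-rank matrix rather than merely fitting the observed entries.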
Discussion: Do we have enough average-case assumptions?