6.883 Advanced Machine Learning — Learning with Combinatorial Structure
Realworld machine learning tasks frequently involve combinatorial structure. How model, infer or predict with graphs, matchings, hierarchies, informative subsets or other discrete structure underlying the data? This graduate course explores mathematical models, overarching concepts and algorithmic techniques for solving such problems efficiently. A focus will lie on understanding connections between machine learning, suitable representations, and convex and combinatorial optimization.
Generics
Tentative List of Topics
Applications
structured prediction
combinatorial norms, (structured) sparse and lowrank representations
modeling diversity, informativeness and influence
selected graph and network problems
Overarching concepts and representations
polyhedral representations: convex hulls and combinatorial polyhedra
submodular functions and matroids
determinantal representations: diversity, repulsion and random trees
Algorithms
submodular optimization: relaxations, convexity, greedy methods
nonsmooth convex optimization
conditional gradients everywhere
scaling to larger problems, splitting methods, online and stochastic methods
statistical and computational tradeoffs
Lectures
Disclaimer: these lecture notes have been checked only loosely, and may contain typos or errors. If you find any, please let me know. If you find those notes useful, I'd be happy to hear about that too. Not all lectures were scribed, so the below only covers parts of the course.
Lecture 1: overview
Lecture 2: convex analysis recap
Lecture 3: submodularity definition, structured prediction
Lecture 4: structured prediction, subgradients and subgradient method
Lecture 5: structured prediction, subgradient method, cutting planes
Lecture 6: graphical model inference and perfect graphs
Lecture 7: extensions of submodular functions, logsubmodularity in graphical models
Lecture 8: base polytopes and matroids, submodular minimization problem and its dual
Lecture 9: parametric submodular minimization, minimumm norm point problem
Lecture 10: applications of the minimum norm point problem; dual decomposition and splitting
Lecture 11: sparse regression
Lecture 12: structured sparsity and convex relaxation
Lecture 13: proximal gradient, atomic norms
Lecture 14: FrankWolfe algorithm
Lecture 15: submodular maximization: applications, greedy algorithm
Lecture 16: submodular maximization: continuous greedy
Lecture 17: maximizing nonmonotone submodular functions
Lecture 18: scaling the greedy algorithm
Lecture 19: (combinatorial) online learning
Lecture 20: online mirror descent, multiarmed bandit problem
Lecture 21: determinantal point processes
Lecture 22: determinantal point processes
