Inference and Representation
DS-GA-1005 and CSCI-GA.2569 Fall 2014
Overview
This course covers graphical models, causal inference, and
advanced topics in statistical machine learning. A
graphical model is a probabilistic model in which the
conditional dependencies between random variables are
specified via a graph. Graphical models provide a flexible
framework for modeling large collections of variables with
complex interactions, as evidenced by their wide range of
applications, including machine learning, computer vision,
speech, and computational biology. This course will
provide a comprehensive survey of learning and inference
methods in graphical models.
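As a concrete illustration of the overview above, here is a minimal Python sketch of a directed graphical model: the classic "sprinkler" Bayesian network, whose joint distribution factorizes along the edges of a small DAG. The variable names and probability tables are illustrative textbook-style values, not course material.

```python
# A minimal sketch of a Bayesian network: the joint distribution over
# (Cloudy, Sprinkler, Rain, Wet) factorizes according to a directed graph
# as P(C) P(S|C) P(R|C) P(W|S,R). All numbers here are illustrative.

import itertools

p_cloudy = {True: 0.5, False: 0.5}                      # P(Cloudy)
p_sprinkler = {True: {True: 0.1, False: 0.9},           # P(Sprinkler | Cloudy)
               False: {True: 0.5, False: 0.5}}          # indexed [cloudy][sprinkler]
p_rain = {True: {True: 0.8, False: 0.2},                # P(Rain | Cloudy)
          False: {True: 0.2, False: 0.8}}               # indexed [cloudy][rain]
p_wet = {(True, True): 0.99, (True, False): 0.9,        # P(Wet=True | Sprinkler, Rain)
         (False, True): 0.9, (False, False): 0.0}

def joint(c, s, r, w):
    """Joint probability via the DAG factorization P(C)P(S|C)P(R|C)P(W|S,R)."""
    pw = p_wet[(s, r)] if w else 1.0 - p_wet[(s, r)]
    return p_cloudy[c] * p_sprinkler[c][s] * p_rain[c][r] * pw

# Inference by brute-force enumeration: P(Rain=True | Wet=True).
num = sum(joint(c, s, True, True)
          for c, s in itertools.product([True, False], repeat=2))
den = sum(joint(c, s, r, True)
          for c, s, r in itertools.product([True, False], repeat=3))
print(round(num / den, 4))  # -> 0.7079
```

Brute-force enumeration like this is exponential in the number of variables; the inference algorithms covered later in the course (variable elimination, belief propagation, variational and Monte Carlo methods) exploit the graph structure to do better.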
General information
Lecture: Tuesdays, 7:10-9:00pm, Warren Weaver Hall 101
Office hours: Tuesdays, 10:30-11:30am, at the Center for Data Science, 726 Broadway, 7th Floor
Grading: problem sets (55%) + midterm exam (20%) + final exam (20%) + participation (5%). See the Problem Set policy below.
Book (required): Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012. I recommend the latest (4th) printing, as the earlier printings had many typos. To check which printing you have, look at the inside cover, below the "Library of Congress" information: if it says "10 9 8 ... 4", you have the (correct) fourth printing.
Mailing list: To subscribe to the class list, follow the instructions here.
Draft Schedule (Readings refer to the Murphy book)
Week 1 (Sept 2): Introduction, Bayesian networks [Slides]
Readings: Chapter 10 (Bayesian networks); Sec. 17.3, 17.4, 17.5.1 (HMMs); How to write a spelling corrector (optional); Introduction to Probabilistic Topic Models (optional)

Week 2 (Sept 9): Undirected graphical models, conditional random fields [Slides]
Readings: Sec. 9-9.2.5 (exponential family); Sec. 19-19.4.4, 19.6 (MRFs, CRFs); An Introduction to Conditional Random Fields (section 2); original paper introducing CRFs (optional); application to camera lens blur (optional); application to predicting wind flow (optional)

Week 3 (Sept 16): Learning and causal inference: maximum likelihood learning, structure learning, causal networks [Slides] [BN structure slides]
Readings: Sec. 26-26.6 (structure learning & causal inference); Sec. 2.8.2 (KL-divergence); overview of causal inference in statistics by Judea Pearl (sec. 3.2; optional); paper on causal inference used for computational advertising (optional); Science paper on learning gene regulatory networks (optional); learning causal networks using interventions (optional)
Assignment: ps2 due Sept. 26 at 5pm [Solutions]

Week 4 (Sept 23): Exact inference: variable elimination, treewidth [Slides]
Readings: Chapter 20

Week 5 (Sept 30): Belief propagation [Slides] [Notes]
Readings: Sec. 22.2 (loopy belief propagation); using loopy BP for stereo vision (optional)

Week 6 (Oct 7): Integer linear programming (MAP inference): linear programming relaxations, dual decomposition [Slides]
Readings: Sec. 22.6 (MAP inference); derivation relating dual decomposition & LP relaxations; Introduction to Dual Decomposition for Inference (optional); Integer Programming for Bayesian Network Structure Learning (optional)

Oct 21: Midterm exam (in class). No class Oct 14 (Fall Recess); office hours will be held on 10/14.

Week 7 (Oct 28): Variational inference: mean-field approximation [Slides]
Readings: Sec. 21.1-21.4; Graphical models, exponential families, and variational inference (optional); Stochastic Variational Inference (optional)

Week 8 (Nov 4): Monte Carlo methods for inference: importance sampling, Metropolis-Hastings, Gibbs sampling [Slides 1] [Slides 2]
Readings: Sec. 23.1-23.4; Sec. 17.2.3; Sec. 24.1-24.4

Week 9 (Nov 11): More variational inference: TRW, loopy BP [Slides]
Readings: Sec. 22.3-22.4

Week 10 (Nov 18): Discovering latent variables using the method of moments (guest lecture by Daniel Hsu) [Slides 1] [Slides 2]
Readings: A Method of Moments for Mixture Models and Hidden Markov Models (Sec. 1 & 2); Method of Moments for Topic Modeling (optional)

Week 11 (Nov 25): Structured prediction [Slides]
Readings: Sec. 19.7 (also, review 19.6 again); Tutorial on Structured Prediction (optional); original paper introducing the structured perceptron (optional); Cutting-Plane Training of Structural SVMs (optional); Block-Coordinate Frank-Wolfe Optimization for Structural SVMs (optional)

Week 12 (Dec 2): Learning Markov networks: pseudo-likelihood, deep generative models [Slides 1] [Slides 2]
Readings: Sec. 19.5, 26.7-26.8; Sec. 27.7, 28.2; notes on pseudo-likelihood; An Introduction to Conditional Random Fields (section 4; optional); approximate maximum entropy learning in MRFs (optional)

Week 13 (Dec 9): Advanced topics in learning; probabilistic programming
Readings: talk for recent ICML 2014 paper on deep generative models (optional); probabilistic programming: overview and bibliography (optional); probabilistic programming: BUGS (optional); probabilistic programming: Church book (optional)

Dec 16: Final exam (in class)
Reference materials
Problem Set policy
I expect you to try to solve each problem set on your own. However, if you get stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules: