Inference and Representation
DS-GA 1005 and CSCI-GA.2569, Fall 2014
Overview
This course covers graphical models, causal inference, and
advanced topics in statistical machine learning. A
graphical model is a probabilistic model in which the
conditional dependencies between the random variables are
specified via a graph. Graphical models provide a flexible
framework for modeling large collections of variables with
complex interactions, as evidenced by their wide range of
applications, including machine learning, computer vision,
speech recognition, and computational biology. This course
will provide a comprehensive survey of learning and
inference methods in graphical models.
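To make the factorization idea concrete, here is a minimal sketch (not part of the course materials) of how a directed graphical model encodes a joint distribution as a product of local conditionals. The rain/sprinkler/wet-grass network and its numbers are the standard textbook toy example, chosen purely for illustration:

```python
# A Bayesian network over three binary variables encodes the factorization
#   p(rain, sprinkler, wet) = p(rain) * p(sprinkler) * p(wet | rain, sprinkler).
# The graph structure says "wet" has parents "rain" and "sprinkler".

p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
# Conditional probability table for p(wet=True | rain, sprinkler).
p_wet = {
    (True, True): 0.99, (True, False): 0.90,
    (False, True): 0.85, (False, False): 0.01,
}

def joint(rain, sprinkler, wet):
    """Joint probability as a product of the local conditionals."""
    pw = p_wet[(rain, sprinkler)] if wet else 1.0 - p_wet[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * pw

# Marginal p(wet=True) by brute-force summation over the other variables --
# exactly the kind of computation that inference algorithms later in the
# course (variable elimination, belief propagation) make efficient.
p_wet_marginal = sum(
    joint(r, s, True) for r in (True, False) for s in (True, False)
)
```

With three binary variables the brute-force sum has only four terms; the point of the inference methods covered in this course is to avoid the exponential blow-up of this sum as the number of variables grows.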

General information
Lecture: Tuesdays, 7:10–9:00pm, in Warren Weaver Hall 101
Office hours: Tuesdays, 10:30–11:30am. Location: Center for Data Science, 726 Broadway, 7th Floor
Grading: problem sets (55%) + midterm exam (20%) + final exam (20%) + participation (5%). See the Problem Set policy below.
Book (required): Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012. I recommend the latest (4th) printing, as the earlier printings had many typos. You can tell which printing you have as follows: check the inside cover, below the "Library of Congress" information. If it says "10 9 8 ... 4" you've got the (correct) fourth printing.
Mailing list: To subscribe to the class list, follow the instructions here.
Draft Schedule (Readings refer to the Murphy book)
Week  Date  Topic  Readings  Assignments 
Week 1 (Sept 2)
Topic: Introduction, Bayesian networks [Slides]
Readings: Chapter 10 (Bayesian networks); Sec. 17.3, 17.4, 17.5.1 (HMMs); How to Write a Spelling Corrector (optional); Introduction to Probabilistic Topic Models (optional)

Week 2 (Sept 9)
Topic: Undirected graphical models [Slides]; conditional random fields
Readings: Sec. 9–9.2.5 (exponential family); Sec. 19–19.4.4, 19.6 (MRFs, CRFs); An Introduction to Conditional Random Fields (section 2); original paper introducing CRFs (optional); application to camera lens blur (optional); application to predicting wind flow (optional)

Week 3 (Sept 16)
Topic: Learning and causal inference [Slides] [BN structure slides]; maximum likelihood learning, structure learning, causal networks
Readings: Sec. 26–26.6 (structure learning & causal inference); Sec. 2.8.2 (KL divergence); overview of causal inference in statistics by Judea Pearl (sec. 3.2; optional); paper on causal inference used for computational advertising (optional); Science paper on learning gene regulatory networks (optional); learning causal networks using interventions (optional)
Assignment: ps2 due Sept. 26 at 5pm [Solutions]
Week 4 (Sept 23)
Topic: Exact inference [Slides]; variable elimination, treewidth
Readings: Chapter 20

Week 5 (Sept 30)
Topic: Belief propagation [Slides] [Notes]
Readings: Sec. 22.2 (loopy belief propagation); using loopy BP for stereo vision (optional)

Week 6 (Oct 7)
Topic: Integer linear programming (MAP inference) [Slides]; linear programming relaxations, dual decomposition
Readings: Sec. 22.6 (MAP inference); derivation relating dual decomposition & LP relaxations; Introduction to Dual Decomposition for Inference (optional); Integer Programming for Bayesian Network Structure Learning (optional)

Oct 21 (no class Oct 14, Fall Recess; office hours will still be held on 10/14)
Midterm exam (in class)


Week 7 (Oct 28)
Topic: Variational inference [Slides]; mean-field approximation
Readings: Sec. 21.1–21.4; Graphical Models, Exponential Families, and Variational Inference (optional); Stochastic Variational Inference (optional)

Week 8 (Nov 4)
Topic: Monte Carlo methods for inference [Slides 1] [Slides 2]; importance sampling, Metropolis-Hastings, Gibbs sampling
Readings: Sec. 23.1–23.4; Sec. 17.2.3; Sec. 24.1–24.4

Week 9 (Nov 11)
Topic: More variational inference [Slides]; TRW, loopy BP
Readings: Sec. 22.3–22.4

Week 10 (Nov 18)
Topic: Discovering latent variables using the method of moments [Slides 1] [Slides 2]; guest lecture by Daniel Hsu
Readings: A Method of Moments for Mixture Models and Hidden Markov Models (Sec. 1 & 2); Method of Moments for Topic Modeling (optional)

Week 11 (Nov 25)
Topic: Structured prediction [Slides]
Readings: Sec. 19.7 (also, review 19.6 again); Tutorial on Structured Prediction (optional); original paper introducing the structured perceptron (optional); Cutting-Plane Training of Structural SVMs (optional); Block-Coordinate Frank-Wolfe Optimization for Structural SVMs (optional)

Week 12 (Dec 2)
Topic: Learning Markov networks [Slides 1] [Slides 2]; pseudolikelihood, deep generative models
Readings: Sec. 19.5, 26.7–26.8; Sec. 27.7, 28.2; notes on pseudolikelihood; An Introduction to Conditional Random Fields (section 4; optional); approximate maximum entropy learning in MRFs (optional)

Week 13 (Dec 9)
Topic: Advanced topics in learning; probabilistic programming
Readings: talk on a recent ICML 2014 paper on deep generative models (optional); probabilistic programming: overview and bibliography (optional); probabilistic programming: BUGS (optional); probabilistic programming: Church book (optional)

Dec 16
Final exam (in class)
Reference materials

Problem Set policy
I expect you to try solving each problem set on your own. However, if you get stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules:
