Inference and Representation
DS-GA 1005, Fall 2015
Overview
This course covers how to think about and model data. We
introduce the tools of probabilistic graphical models as a
means of representing and manipulating data, modeling
uncertainty, and discovering new insights from data. We
will particularly emphasize latent variable models,
examples of which include latent Dirichlet allocation (for
topic modeling), factor analysis, and Gaussian processes.
The class will also discuss modeling temporal data (e.g.,
hidden Markov models), hierarchical models, deep
generative models, and structured prediction. On
completing this course, students should be able to take a
new problem or data set, formulate an appropriate model,
learn the model from data, and ultimately answer their
original question using inference in the model. 
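The workflow described above (formulate a model, learn it from data, then answer the original question via inference) can be sketched with a toy example. The following beta-binomial coin model with a grid-approximated posterior is purely illustrative and not part of the course materials (the labs use PyMC3):

```python
# Beta-binomial coin model: a minimal sketch of the
# model -> learn -> infer workflow described above.
# (Illustrative only; the course labs use PyMC3.)

def posterior(heads, tails, grid_size=1000):
    """Grid-approximate posterior over the coin bias theta,
    starting from a uniform (Beta(1,1)) prior."""
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    # Likelihood of the observed flips under each candidate theta.
    weights = [t**heads * (1 - t)**tails for t in thetas]
    total = sum(weights)
    return thetas, [w / total for w in weights]

def posterior_mean(heads, tails):
    """Answer the question "how biased is the coin?" by inference:
    take the expectation of theta under the posterior."""
    thetas, probs = posterior(heads, tails)
    return sum(t * p for t, p in zip(thetas, probs))

# After observing 7 heads and 3 tails, the posterior mean is close
# to the exact Beta(8, 4) mean of 8/12.
print(round(posterior_mean(7, 3), 3))
```

The same three steps (choosing a likelihood and prior, conditioning on data, querying the posterior) reappear in every model covered in the course, just with richer structure and approximate inference in place of this exhaustive grid.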

General information
Lecture: Tuesdays, 5:10–7:00pm, in Warren Weaver Hall 102
Office hours: Thursdays, 3:30–4:30pm. Location: 715 Broadway (intersection with Washington Place), 12th Floor, room 1204
Grading: [not finalized] problem sets (55%) + midterm exam (20%) + final exam (20%) + participation (5%). Problem Set policy
Book (required): Kevin Murphy, Machine Learning: a Probabilistic Perspective, MIT Press, 2012. I recommend the latest (4th) printing, as the earlier printings had many typos. You can tell which printing you have as follows: check the inside cover, below the "Library of Congress" information. If it says "10 9 8 ... 4", you've got the (correct) fourth printing.
Piazza: We will use Piazza
to answer questions and post announcements about the
course. Please sign up here.
Students' use of Piazza, particularly for answering other students' questions well, will contribute toward their participation grade.
Online recordings: Most of the lecture and lab videos will be posted to techtalks.tv. Note, however, that class attendance is required.
Schedule (Readings refer to the Murphy book)
Week  Date  Topic  Readings  Assignments 
1  Sept 8 
Introduction, Bayesian networks [slides] Lab: Probability review, Bayesian network basics, PyMC3 tutorial 
Chapters 1, 2, and 3.5–3.5.3 (should be review for most people) Chapter 10 (Bayesian networks), except for 10.2.5 and 10.6 Probabilistic Programming and Bayesian Methods, Chapter 1 Algorithm for d-separation, from Koller & Friedman (optional) 

2  Sept 15 
Case studies #1–2: Probabilistic modeling in neuroscience (Neil Rabinowitz) [slides] and political science (Pablo Barberá) [slides] Lab: Review of case studies and BN structure learning 
Sec. 17.3, 17.4, 17.5.1 (HMMs) Introduction to Probabilistic Topic Models (optional) Rabinowitz paper (optional) 

3 
Sept 22 
Undirected graphical models [slides] Conditional random fields, Gaussian MRFs Case study #3: Astronomy (Dan Foreman-Mackey) [slides] Lab: Some subtleties on BNs, MRF review, CRF intro 
Sec. 9–9.2.5 (exponential family) Sec. 19–19.4.4, 19.6 (MRFs, CRFs) An Introduction to Conditional Random Fields (section 2) Original paper introducing CRFs (optional) Application to camera lens blur (optional) 
ps2 [data] due Oct. 2 at 5pm [Solutions] 
4 
Sept 29 
Exact inference [slides] Variable elimination, treewidth, belief propagation Lab: Graph separation in MRFs, revisiting CRFs, BP, pruning barren nodes 
Chapter 20 Sec. 22.2 (loopy belief propagation) Using loopy BP for stereo vision (optional) 

5 
Oct 6 (no lab this week) 
Unsupervised learning [slides] Expectation Maximization Case study #4: Education (Kevin Wilson, Knewton) 
Sec. 11–11.4.4, 11.4.7 (EM) Sec. 17.5.2 (EM for HMMs) The Expectation Maximization Algorithm: A short tutorial Measuring student knowledge (see Section 3; optional) 
ps3 due Oct. 14 at 4pm 
6 
Oct 14 (Weds, 6:10–8pm in 719 Broadway, 12th floor, room 1221) 
Monte Carlo methods [slides 1] [slides 2] Gibbs sampling 
Sec. 23.1–23.4 Sec. 24.1–24.4.1 Sec. 17.2.3 
ps4 [data] due Oct. 23 at 5pm [Solutions] 
7 
Oct 20 
Causal inference & Bayesian additive regression trees [slides] Lab: Midterm review 
Hill & Gelman Ch. 9 (pp. 167–175, 181–188) Chipman et al.'s BART paper and bartMachine (optional) Hill's Bayesian Nonparametric Modeling for Causal Inference (optional) Particle filtering for BART (optional) Sec. 26.6 (learning causal DAGs; optional) 

Oct 27 
Midterm exam (in class) 

8 
Nov 3 
Topic modeling [slides] Case study #5: Musical influence via dynamic topic models (Uri Shalit) [slides] Lab: Group discussion on real-world modeling problems 
Chapter 27 (except 27.3.6, 27.3.6.7, 27.6.1, and 27.7) Introduction to Probabilistic Topic Models Modeling musical influence with topic models (optional) 

9 
Nov 10 
Gaussian processes [slides] 
Chapter 15 Application to predicting wind flow (optional) 

10 
Nov 17 
Learning Markov random fields [slides] Moment matching, Chow–Liu algorithm, pseudolikelihood Case study #6: Cognitive science (Brenden Lake) [slides] Lab: Exponential families, learning MRFs, and GPs [iPython notebook] 
Sec. 19.5, 26.7–26.8 Notes on pseudolikelihood An Introduction to Conditional Random Fields (section 4; optional) Approximate maximum entropy learning in MRFs (optional) 
ps5 [data] due Dec. 1 at 3pm 
11 
Nov 24 (no lab this week) 
Variational inference [slides] Mean-field approximation 
Sec. 21.1–21.6 Sec. 22.3–22.4 Graphical models, exponential families, and variational inference (optional) 

12  Dec 1 
Learning deep generative models [slides] Stochastic variational inference, Variational autoencoders 
Stochastic Variational Inference (optional) Talk for an ICML 2014 paper on deep generative models (optional) 
ps6 [data] due Dec. 8 at 3pm 
13  Dec 8 
Structured prediction [slides] Lab: Overview of structured prediction, parametrizing CRFs 
Sec. 19.7 (also, review 19.6 again) Tutorial on Structured Prediction (optional) Original paper introducing the structured perceptron (optional) Cutting-Plane Training of Structural SVMs (optional) Block-Coordinate Frank–Wolfe Optimization for Structural SVMs (optional) 
ps7 [data] due Dec. 18 at 5pm 
14  Dec 15 
Integer linear programming [slides] MAP inference, linear programming relaxations, dual decomposition Lab: Review for final 
Sec. 22.6 (MAP inference) Derivation relating dual decomposition & LP relaxations Introduction to Dual Decomposition for Inference (optional) Integer Programming for Bayesian Network Structure Learning (optional) 

Dec 22 (regular class time) 
Final exam 
Reference materials

Problem Set policy
I expect you to try solving each problem set on your own. However, if you get stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules: