Inference and Representation

DS-GA-1005, Fall 2015

Overview

This course covers how to think about and model data. We introduce the tools of probabilistic graphical models as a means of representing and manipulating data, modeling uncertainty, and discovering new insights from data. We will particularly emphasize latent variable models, examples of which include latent Dirichlet allocation (for topic modeling), factor analysis, and Gaussian processes. The class will also discuss modeling temporal data (e.g., hidden Markov models), hierarchical models, deep generative models, and structured prediction. On completing this course, students should be able to take a new problem or data set, formulate an appropriate model, learn the model from data, and ultimately answer their original question using inference in the model.

Prerequisite: DS-GA-1003 (Machine Learning and Computational Statistics), or permission of instructor.

General information

Lecture: Tuesdays, 5:10-7:00pm, in Warren Weaver Hall 102
Recitation/Laboratory (required for all students): Wednesdays, 7:10-8:00pm in Warren Weaver Hall 102

Instructor: David Sontag (dsontag {@ | at} cs.nyu.edu)
Lab instructor: Rachel Hodos (hodos {@ | at} cims.nyu.edu)
Graders: Prasoon Goyal (pg1338 {@ | at} nyu.edu) and Justin Chiu (jchiu {@ | at} cs.nyu.edu)

Office hours: Thursdays, 3:30-4:30pm. Location: 715 Broadway (intersection with Washington Place), 12th Floor, room 1204

Grading: [not finalized] problem sets (55%) + midterm exam (20%) + final exam (20%) + participation (5%). See the Problem Set policy below.

Book (required): Kevin Murphy, Machine Learning: a Probabilistic Perspective, MIT Press, 2012. I recommend the latest (4th) printing, as earlier printings had many typos. You can tell which printing you have as follows: check the inside cover, below the "Library of Congress" information. If it says "10 9 8 ... 4", you have the (correct) fourth printing.

Piazza: We will use Piazza to answer questions and post announcements about the course. Please sign up here. Students' use of Piazza, particularly answering other students' questions well, will contribute toward their participation grade.

Online recordings: Videos of most lectures and labs will be posted to techtalks.tv. Note, however, that class attendance is required.

Schedule (Readings refer to the Murphy book)

Week 1 (Sept 8)
Topic: Introduction, Bayesian networks [slides]
Lab: Probability review, Bayesian network basics, PyMC3 tutorial (see the PyMC3 sketch below)
Readings:
  Chapters 1, 2, and 3.5-3.5.3 (should be review for most people)
  Chapter 10 (Bayesian networks), except for 10.2.5 and 10.6
  Probabilistic Programming and Bayesian Methods -- Chapter 1
  Algorithm for d-separation, from Koller & Friedman (optional)
Assignment: ps1 due Sept. 22 at 3pm [Solutions]
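
To give a sense of what the PyMC3 tutorial is about, here is a minimal sketch of a PyMC3 model: a Beta-Bernoulli model of a coin's bias fit by MCMC. It is not part of the official lab materials; the coin-flip data are made up for illustration, and it assumes the pymc3 package is installed.

import numpy as np
import pymc3 as pm

# Hypothetical coin flips (1 = heads); any small binary data set would do.
flips = np.array([1, 0, 1, 1, 0, 1, 1, 1])

with pm.Model():
    theta = pm.Beta('theta', alpha=1.0, beta=1.0)   # uniform prior on the coin's bias
    pm.Bernoulli('obs', p=theta, observed=flips)    # likelihood of the observed flips
    trace = pm.sample(2000)                         # posterior samples via MCMC

print(trace['theta'].mean())  # posterior mean estimate of the bias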

Week 2 (Sept 15)
Topic: Case studies #1-2: probabilistic modeling in neuroscience (Neil Rabinowitz) [slides] and political science (Pablo Barberá) [slides]
Lab: Review of case studies and BN structure learning
Readings:
  Sec. 17.3, 17.4, 17.5.1 (HMMs)
  Introduction to Probabilistic Topic Models (optional)
  Rabinowitz paper (optional)

Week 3 (Sept 22)
Topic: Undirected graphical models [slides]: conditional random fields, Gaussian MRFs
Case study #3: Astronomy (Dan Foreman-Mackey) [slides]
Lab: Some subtleties on BNs, MRF review, CRF intro
Readings:
  Sec. 9-9.2.5 (exponential family)
  Sec. 19-19.4.4, 19.6 (MRFs, CRFs)
  An Introduction to Conditional Random Fields (section 2)
  Original paper introducing CRFs (optional)
  Application to camera lens blur (optional)
Assignment: ps2 [data] due Oct. 2 at 5pm [Solutions]

Week 4 (Sept 29)
Topic: Exact inference [slides]: variable elimination, treewidth, belief propagation
Lab: Graph separation in MRFs, revisiting CRFs, BP, pruning barren nodes
Readings:
  Chapter 20
  Sec. 22.2 (loopy belief propagation)
  Using loopy BP for stereo vision (optional)


Week 5 (Oct 6)
Topic: Unsupervised learning [slides]: Expectation Maximization (see the EM sketch below)
Case study #4: Education (Kevin Wilson, Knewton)
Lab: (no lab this week)
Readings:
  Sec. 11-11.4.4, 11.4.7 (EM)
  Sec. 17.5.2 (EM for HMMs)
  The Expectation Maximization Algorithm: A short tutorial
  Measuring student knowledge (see Section 3; optional)
Assignment: ps3 due Oct. 14 at 4pm
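
As a complement to the EM readings, here is a minimal numpy sketch of EM for a two-component Gaussian mixture in one dimension. The data are synthetic and the code is purely illustrative; it is not part of the course materials.

import numpy as np

rng = np.random.RandomState(0)
# Synthetic data drawn from a mixture of two Gaussians.
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

# Initial guesses for the means, variances, and mixing weights.
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: responsibilities r[n, k] = p(z_n = k | x_n, current parameters).
    dens = (pi / np.sqrt(2 * np.pi * var)) * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
    r = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate parameters from the responsibility-weighted data.
    Nk = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    pi = Nk / len(x)

print(mu, var, pi)  # the means should end up near -2 and 3
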
Week 6 (Oct 14: Weds, 6:10-8pm in 719 Broadway, 12th floor, room 1221)
Topic: Monte-Carlo methods [slides 1] [slides 2]: Gibbs sampling (see the Gibbs sampling sketch below)
Readings:
  Sec. 23.1-23.4
  Sec. 24.1-24.4.1
  Sec. 17.2.3
Assignment: ps4 [data] due Oct. 23 at 5pm [Solutions]
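
To make the Gibbs sampling idea concrete, here is a minimal numpy sketch that samples from a correlated bivariate normal by alternately drawing each coordinate from its exact conditional distribution. The target distribution and its parameters are made up for illustration.

import numpy as np

rng = np.random.RandomState(0)
rho = 0.8                 # correlation of the target bivariate standard normal
n_samples = 5000
samples = np.zeros((n_samples, 2))
x, y = 0.0, 0.0           # arbitrary starting point

for t in range(n_samples):
    # Each full conditional of a standard bivariate normal is itself normal.
    x = rng.normal(rho * y, np.sqrt(1 - rho ** 2))
    y = rng.normal(rho * x, np.sqrt(1 - rho ** 2))
    samples[t] = x, y

print(np.corrcoef(samples[1000:].T))  # off-diagonal entries should be close to rho
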
Week 7 (Oct 20)
Topic: Causal inference & Bayesian additive regression trees [slides]
Lab: Midterm review
Readings:
  Hill & Gelman Ch. 9 (p. 167-175, 181-188)
  Chipman et al.'s BART paper and bartMachine (optional)
  Hill's Bayesian Nonparametric Modeling for Causal Inference (optional)
  Particle filtering for BART (optional)
  Sec. 26.6 (learning causal DAGs; optional)

Oct 27
Midterm exam (in class)



Week 8 (Nov 3)
Topic: Topic modeling [slides]
Case study #5: Musical influence via dynamic topic models (Uri Shalit) [slides]
Lab: Group discussion on real-world modeling problems
Readings:
  Chapter 27 (except 27.3.6, 27.3.6.7, 27.6.1, and 27.7)
  Introduction to Probabilistic Topic Models
  Modeling musical influence with topic models (optional)

Week 9 (Nov 10)
Topic: Gaussian processes [slides] (see the GP regression sketch below)
Readings:
  Chapter 15
  Application to predicting wind flow (optional)
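
To give a flavor of the Chapter 15 material, here is a minimal numpy sketch of Gaussian process regression with an RBF kernel on synthetic one-dimensional data. It is purely illustrative; in practice one would use a Cholesky solve rather than an explicit matrix inverse.

import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    # Squared-exponential (RBF) kernel between two sets of 1-D inputs.
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, 20)             # synthetic training inputs
y = np.sin(X) + 0.1 * rng.randn(20)    # noisy observations of sin(x)
Xs = np.linspace(-3, 3, 100)           # test inputs

noise_var = 0.1 ** 2
K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
Ks = rbf_kernel(Xs, X)
Kss = rbf_kernel(Xs, Xs)

# GP posterior mean and covariance at the test inputs.
K_inv = np.linalg.inv(K)
mean = Ks @ K_inv @ y
cov = Kss - Ks @ K_inv @ Ks.T
print(mean[:5], np.diag(cov)[:5])
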

Week 10 (Nov 17)
Topic: Learning Markov random fields [slides]: moment matching, Chow-Liu algorithm, pseudo-likelihood (see the Chow-Liu sketch below)
Case study #6: Cognitive science (Brenden Lake) [slides]
Lab: Exponential families, learning MRFs, and GPs [iPython notebook]
Readings:
  Sec. 19.5, 26.7-26.8
  Notes on pseudo-likelihood
  An Introduction to Conditional Random Fields (section 4; optional)
  Approximate maximum entropy learning in MRFs (optional)
Assignment: ps5 [data] due Dec. 1 at 3pm
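
To make the Chow-Liu algorithm concrete, here is a rough numpy/scipy sketch: estimate pairwise mutual information from binary data, then take a maximum-weight spanning tree (computed as a minimum spanning tree on the negated weights). The data are random placeholders and the code is illustrative only.

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.RandomState(0)
X = rng.randint(0, 2, size=(500, 5))  # placeholder binary data: 500 samples, 5 variables

n_vars = X.shape[1]
mi = np.zeros((n_vars, n_vars))
for i in range(n_vars):
    for j in range(i + 1, n_vars):
        # Empirical joint distribution of variables i and j, and its marginals.
        joint = np.array([[np.mean((X[:, i] == a) & (X[:, j] == b))
                           for b in (0, 1)] for a in (0, 1)])
        marg = np.outer(joint.sum(axis=1), joint.sum(axis=0))
        mask = joint > 0
        mi[i, j] = np.sum(joint[mask] * np.log(joint[mask] / marg[mask]))

# Chow-Liu tree = maximum-weight spanning tree on mutual information,
# obtained here as a minimum spanning tree on the negated weights.
tree = minimum_spanning_tree(-mi)
print(list(zip(*tree.nonzero())))  # edges of the learned tree
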
Week 11 (Nov 24)
Topic: Variational inference [slides]: mean-field approximation (see the mean-field sketch below)
Lab: (no lab this week)
Readings:
  Sec. 21.1-21.6
  Sec. 22.3-22.4
  Graphical models, exponential families, and variational inference (optional)
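
As a tiny illustration of the mean-field idea, here is a numpy sketch of coordinate-ascent mean-field updates for a pairwise model over spins x_i in {-1, +1}. The parameters are random placeholders; each update sets the approximate marginal mean of one variable given the current means of the others.

import numpy as np

rng = np.random.RandomState(0)
n = 10
# Placeholder pairwise model: p(x) proportional to exp(sum_i theta_i x_i + sum_{i<j} J_ij x_i x_j).
theta = 0.5 * rng.randn(n)
J = 0.1 * rng.randn(n, n)
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)

mu = np.zeros(n)  # mean-field estimates of E[x_i]
for sweep in range(100):
    for i in range(n):
        # Coordinate-ascent mean-field update: mu_i <- tanh(theta_i + sum_j J_ij * mu_j).
        mu[i] = np.tanh(theta[i] + J[i] @ mu)

print(mu)
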

Week 12 (Dec 1)
Topic: Learning deep generative models [slides]: stochastic variational inference, variational auto-encoders
Readings:
  Stochastic Variational Inference (optional)
  Talk for an ICML 2014 paper on deep generative models (optional)
Assignment: ps6 [data] due Dec. 8 at 3pm

Week 13 (Dec 8)
Topic: Structured prediction [slides]
Lab: Overview of structured prediction, parametrizing CRFs
Readings:
  Sec. 19.7 (also, review 19.6 again)
  Tutorial on Structured Prediction (optional)
  Original paper introducing the structured perceptron (optional)
  Cutting-Plane Training of Structural SVMs (optional)
  Block-Coordinate Frank-Wolfe Optimization for Structural SVMs (optional)
Assignment: ps7 [data] due Dec. 18 at 5pm

Week 14 (Dec 15)
Topic: Integer linear programming [slides]: MAP inference, linear programming relaxations, dual decomposition
Lab: Review for final
Readings:
  Sec. 22.6 (MAP inference)
  Derivation relating dual decomposition & LP relaxations
  Introduction to Dual Decomposition for Inference (optional)
  Integer Programming for Bayesian Network Structure Learning (optional)


Dec 22
Final exam (regular class time)




Acknowledgements: Many thanks to TTI-C, CMU, Hebrew University, UC Berkeley, and Stanford University for sharing material used in slides and homeworks.

Problem Set policy

I expect you to try solving each problem set on your own. However, if you get stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules:

  1. You may discuss a problem with any student in this class and work together on solving it. This can involve brainstorming, verbally discussing the problem, and going through possible solutions together, but it should not involve one student telling another a complete solution.

  2. Once you solve the homework, you must write up your solutions on your own, without looking at other people's write-ups or giving your write-up to others.

  3. In your solution for each problem, you must write down the name of every person with whom you discussed it. This will not affect your grade.

  4. Do not consult solution manuals or other people's solutions from similar courses.
Late submission policy: During the semester you are allowed at most two extensions on homework assignments. Each extension is for at most 48 hours and carries a penalty of 25% off your assignment grade.