Inference and Representation
DS-GA-1005 and CSCI-GA.2569 Fall 2014
Overview
This course covers graphical models, causal inference, and
advanced topics in statistical machine learning. A
graphical model is a probabilistic model in which the
conditional dependencies between random variables are
specified via a graph. Graphical models provide a flexible
framework for modeling large collections of variables with
complex interactions, as evidenced by their wide range of
applications, including machine learning, computer vision,
speech, and computational biology. This course will
provide a comprehensive survey of learning and inference
methods in graphical models.
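As a concrete illustration of the overview above, here is a minimal Python sketch of a directed graphical model: the classic "sprinkler" Bayesian network, whose joint distribution factorizes along the edges of a small DAG. The variable names and probability tables are illustrative textbook-style values, not course material.

```python
# A minimal sketch of a Bayesian network: the joint distribution over
# (Cloudy, Sprinkler, Rain, Wet) factorizes according to a directed graph
# as P(C) P(S|C) P(R|C) P(W|S,R). All numbers here are illustrative.

import itertools

p_cloudy = {True: 0.5, False: 0.5}                      # P(Cloudy)
p_sprinkler = {True: {True: 0.1, False: 0.9},           # P(Sprinkler | Cloudy)
               False: {True: 0.5, False: 0.5}}          # indexed [cloudy][sprinkler]
p_rain = {True: {True: 0.8, False: 0.2},                # P(Rain | Cloudy)
          False: {True: 0.2, False: 0.8}}               # indexed [cloudy][rain]
p_wet = {(True, True): 0.99, (True, False): 0.9,        # P(Wet=True | Sprinkler, Rain)
         (False, True): 0.9, (False, False): 0.0}

def joint(c, s, r, w):
    """Joint probability via the DAG factorization P(C)P(S|C)P(R|C)P(W|S,R)."""
    pw = p_wet[(s, r)] if w else 1.0 - p_wet[(s, r)]
    return p_cloudy[c] * p_sprinkler[c][s] * p_rain[c][r] * pw

# Inference by brute-force enumeration: P(Rain=True | Wet=True).
num = sum(joint(c, s, True, True)
          for c, s in itertools.product([True, False], repeat=2))
den = sum(joint(c, s, r, True)
          for c, s, r in itertools.product([True, False], repeat=3))
print(round(num / den, 4))  # -> 0.7079
```

Brute-force enumeration like this is exponential in the number of variables; the inference algorithms covered later in the course (variable elimination, belief propagation, variational and Monte Carlo methods) exploit the graph structure to do better.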
General information
Lecture: Tuesdays, 7:10-9:00pm, Warren Weaver Hall 101
Office hours: Tuesdays, 10:30-11:30am, at the Center for Data Science, 726 Broadway, 7th Floor
Grading: problem sets (55%) + midterm exam (20%) + final exam (20%) + participation (5%). See the Problem Set policy below.
Book (required): Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012. I recommend the latest (4th) printing, as the earlier printings had many typos. To check which printing you have, look at the inside cover, below the "Library of Congress" information: if it says "10 9 8 ... 4", you have the (correct) fourth printing.
Mailing list: To subscribe to the class list, follow the instructions here.
Draft Schedule (Readings refer to the Murphy book)
Week 1 (Sept 2): Introduction, Bayesian networks [Slides]
Readings: Chapter 10 (Bayesian networks); Sec. 17.3, 17.4, 17.5.1 (HMMs); How to write a spelling corrector (optional); Introduction to Probabilistic Topic Models (optional)

Week 2 (Sept 9): Undirected graphical models, conditional random fields [Slides]
Readings: Sec. 9-9.2.5 (exponential family); Sec. 19-19.4.4, 19.6 (MRFs, CRFs); An Introduction to Conditional Random Fields (section 2); original paper introducing CRFs (optional); application to camera lens blur (optional); application to predicting wind flow (optional)

Week 3 (Sept 16): Learning and causal inference: maximum likelihood learning, structure learning, causal networks [Slides] [BN structure slides]
Readings: Sec. 26-26.6 (structure learning & causal inference); Sec. 2.8.2 (KL-divergence); overview of causal inference in statistics by Judea Pearl (sec. 3.2; optional); paper on causal inference used for computational advertising (optional); Science paper on learning gene regulatory networks (optional); learning causal networks using interventions (optional)
Assignment: ps2 due Sept. 26 at 5pm [Solutions]

Week 4 (Sept 23): Exact inference: variable elimination, treewidth [Slides]
Readings: Chapter 20

Week 5 (Sept 30): Belief propagation [Slides] [Notes]
Readings: Sec. 22.2 (loopy belief propagation); using loopy BP for stereo vision (optional)

Week 6 (Oct 7): Integer linear programming (MAP inference): linear programming relaxations, dual decomposition [Slides]
Readings: Sec. 22.6 (MAP inference); derivation relating dual decomposition & LP relaxations; Introduction to Dual Decomposition for Inference (optional); Integer Programming for Bayesian Network Structure Learning (optional)

Oct 21: Midterm exam (in class). No class Oct 14 (Fall Recess); office hours will be held on 10/14.

Week 7 (Oct 28): Variational inference: mean-field approximation [Slides]
Readings: Sec. 21.1-21.4; Graphical models, exponential families, and variational inference (optional); Stochastic Variational Inference (optional)

Week 8 (Nov 4): Monte Carlo methods for inference: importance sampling, Metropolis-Hastings, Gibbs sampling [Slides 1] [Slides 2]
Readings: Sec. 23.1-23.4; Sec. 17.2.3; Sec. 24.1-24.4

Week 9 (Nov 11): More variational inference: TRW, loopy BP [Slides]
Readings: Sec. 22.3-22.4

Week 10 (Nov 18): Discovering latent variables using the method of moments (guest lecture by Daniel Hsu) [Slides 1] [Slides 2]
Readings: A Method of Moments for Mixture Models and Hidden Markov Models (Sec. 1 & 2); Method of Moments for Topic Modeling (optional)

Week 11 (Nov 25): Structured prediction [Slides]
Readings: Sec. 19.7 (also, review 19.6 again); Tutorial on Structured Prediction (optional); original paper introducing the structured perceptron (optional); Cutting-Plane Training of Structural SVMs (optional); Block-Coordinate Frank-Wolfe Optimization for Structural SVMs (optional)

Week 12 (Dec 2): Learning Markov networks: pseudo-likelihood, deep generative models [Slides 1] [Slides 2]
Readings: Sec. 19.5, 26.7-26.8; Sec. 27.7, 28.2; notes on pseudo-likelihood; An Introduction to Conditional Random Fields (section 4; optional); approximate maximum entropy learning in MRFs (optional)

Week 13 (Dec 9): Advanced topics in learning; probabilistic programming
Readings: talk for recent ICML 2014 paper on deep generative models (optional); probabilistic programming: overview and bibliography (optional); probabilistic programming: BUGS (optional); probabilistic programming: Church book (optional)

Dec 16: Final exam (in class)
Reference materials
Problem Set policy
I expect you to try to solve each problem set on your own. However, if you get stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules: