Machine Learning and Computational Statistics
DS-GA-1003 and CSCI-GA.2567, Spring 2014
Overview

Machine learning is an exciting and fast-moving field at the intersection of computer science, statistics, and optimization, with many recent consumer applications (e.g., Microsoft Kinect, Google Translate, the iPhone's Siri, digital camera face detection, Netflix recommendations, Google News). Machine learning and computational statistics also play a central role in data science. In this graduate-level class, students will learn about the theoretical foundations of machine learning and computational statistics and how to apply them to solve new problems. This is a required course for the MS in Data Science and should be taken in the first year of study; it is also suitable for MS and Ph.D. students in Computer Science and related fields (see pre-requisites below).

For registration information, please contact Varsha Tiger <varsha.tiger@nyu.edu> or Katie Laugel <laugel@cs.nyu.edu>.
General information

Lecture: Tuesdays, 5:10-7pm, in Warren Weaver Hall 109.

Pre-requisites: There are two different sets of pre-requisite courses, to accommodate both Computer Science and Data Science MS students; students are required to have taken one of the two. In all cases, students should be familiar with linear algebra, probability and statistics, and multi-variable calculus, in addition to having good programming skills.

Grading: Problem sets (45%) + midterm exam (25%) + project (25%) + participation (5%). See the Problem Set policy below.

Books: No textbook is required (readings will come from freely available online material). If an additional reference is desired, a good option is Kevin Murphy's Machine Learning: A Probabilistic Perspective (2012). A good reference on linear algebra and probability is Ernest Davis's Linear Algebra and Probability for Computer Science Applications.

Mailing list: To subscribe to the class list, follow the instructions here.
Schedule
Lecture 1 (Jan 28): Introduction to learning [Slides]
  Reading: Chapter 1 of Murphy's book; Notes on perceptron mistake bound (just section 1)
  Assignment: ps1 (data) due Feb 6 at 8pm.

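The perceptron mistake bound in the assigned notes analyzes a very simple online update rule. As a concrete reference point, here is a minimal NumPy sketch of the perceptron on a toy linearly separable dataset; the data, pass count, and variable names are illustrative choices, not taken from the course materials.

    import numpy as np

    # Toy linearly separable data: label is the sign of the first feature.
    rng = np.random.RandomState(0)
    X = rng.randn(100, 2)
    y = np.where(X[:, 0] > 0, 1, -1)

    w = np.zeros(2)          # weight vector (no bias term, for simplicity)
    mistakes = 0
    for _ in range(10):      # a few passes over the data
        for x_i, y_i in zip(X, y):
            if y_i * np.dot(w, x_i) <= 0:   # mistake: prediction disagrees with label
                w += y_i * x_i              # perceptron update
                mistakes += 1

    print("weights:", w, "total mistakes:", mistakes)

The mistake bound from the notes caps the total number of updates in terms of the data radius and margin, independent of the number of passes.
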
Lecture 2 (Feb 4): Support vector machines (SVMs) [Slides]
  Reading: Notes on support vector machines
  Optional: Second reference on SVM dual and kernel methods (sec. 3-8); for more on SVMs, see Hastie, Sections 12.1-12.3 (pg. 435); for more on cross-validation, see Hastie, Section 7.10 (pg. 250)
  Assignment: ps2 due Feb 14 at 5pm. [Solutions]

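For intuition about the soft-margin SVM objective covered in the notes, the rough sketch below minimizes the regularized hinge loss with subgradient descent on toy data. The regularization constant, step-size schedule, and data are arbitrary illustrative choices, not the solver recommended in the course.

    import numpy as np

    rng = np.random.RandomState(0)
    X = rng.randn(200, 2)
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

    C, w, b = 1.0, np.zeros(2), 0.0
    for t in range(1, 1001):
        eta = 1.0 / t                      # decaying step size
        margins = y * (X @ w + b)
        viol = margins < 1                 # points inside the margin or misclassified
        # Subgradient of 0.5*||w||^2 + C * sum(max(0, 1 - y*(w.x + b)))
        grad_w = w - C * (X[viol].T @ y[viol])
        grad_b = -C * y[viol].sum()
        w -= eta * grad_w
        b -= eta * grad_b

    print("training accuracy:", np.mean(np.sign(X @ w + b) == y))
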
Lecture 3 (Feb 11): Kernel methods [Slides] (optimization, Mercer's theorem)
  Reading: Notes on linear algebra, convexity, kernels, and Mercer's theorem
  Optional: For more advanced kernel methods, see chapter 3 of this book (free online from NYU libraries)
  Assignment: ps3 (data) due Feb 25 at 3pm.

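Mercer's theorem, mentioned in the reading, requires a valid kernel's Gram matrix to be symmetric positive semidefinite. A quick numerical sanity check with an RBF (Gaussian) kernel might look like the sketch below; the bandwidth and data are made up for illustration.

    import numpy as np

    def rbf_kernel(X, Z, gamma=0.5):
        """Gaussian/RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - z_j||^2)."""
        sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * sq_dists)

    rng = np.random.RandomState(0)
    X = rng.randn(50, 3)
    K = rbf_kernel(X, X)

    # Mercer: K should be symmetric PSD (up to numerical error).
    eigvals = np.linalg.eigvalsh(K)
    print("symmetric:", np.allclose(K, K.T), "min eigenvalue:", eigvals.min())
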
Lecture 4 (Feb 18): Learning theory [Slides]
  Reading: Notes on learning theory; Notes on gap-tolerant classifiers (section 7.1, pg. 29-31); Pedro Domingos's A Few Useful Things to Know About Machine Learning

Lecture 5 (Feb 25): Decision trees [Slides] (ensemble methods, random forests)
  Reading: Mitchell Ch. 3; Hastie et al., Section 8.7 (bagging)
  Optional: Rudin's lecture notes (on decision trees); Hastie et al., Chapter 15 (on random forests)
  Assignment: ps4 (data) due Mar 7 at 5pm.

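Splits in a decision tree (Mitchell Ch. 3) are typically chosen by information gain. The toy functions below compute the entropy-based gain of a candidate binary split; the data and the threshold are made up for illustration only.

    import numpy as np

    def entropy(y):
        """Entropy of a vector of class labels."""
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return -(p * np.log2(p)).sum()

    def information_gain(y, mask):
        """Reduction in entropy from splitting labels y by a boolean mask."""
        n = len(y)
        left, right = y[mask], y[~mask]
        return entropy(y) - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)

    # Toy example: labels correlated with the sign of a single feature x.
    rng = np.random.RandomState(0)
    x = rng.randn(100)
    y = (x + 0.3 * rng.randn(100) > 0).astype(int)
    print("gain of split at x > 0:", information_gain(y, x > 0))

A tree-growing procedure would evaluate this gain over many candidate thresholds and features and recurse on the resulting subsets; random forests repeat this on bootstrap samples with random feature subsets.
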
Lecture 6 (March 4): Midterm review
  Lab: deep learning (guest lecture by Yann LeCun)

Lecture 7 (March 11): Midterm exam
  Lab: project advisers
  Note: no class, office hours, or lab March 18/20 (Spring break)
  Assignment: Project proposal, due March 27 at 3pm.

Lecture 8 (March 25): Clustering [Slides] (k-means, hierarchical, spectral)
  Reading: Hastie et al., Sections 14.3.6, 14.3.8, 14.3.9, 14.3.12, 14.5.3
  Optional: Tutorial on spectral clustering

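As a concrete reference for the k-means objective covered in this lecture, here is a bare-bones sketch of Lloyd's algorithm on synthetic data; the number of clusters, iteration count, and initialization are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.RandomState(0)
    # Three synthetic Gaussian blobs.
    X = np.vstack([rng.randn(50, 2) + c for c in ([0, 0], [5, 5], [0, 5])])

    k = 3
    centers = X[rng.choice(len(X), k, replace=False)]   # random initialization
    for _ in range(20):
        # Assignment step: nearest center for each point.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points
        # (keep the old center if a cluster happens to be empty).
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                            for j in range(k)])

    print("final centers:\n", centers)
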
Lecture 9 (April 1): Dimensionality reduction [Slides]
  Reading: Notes on PCA; More notes on PCA
  Optional: Barber, Chapter 15; Roweis and Saul, Science 2000; Tenenbaum et al., Science 2000; van der Maaten and Hinton, JMLR '08
  Assignment: ps5 (data) due Apr 15 at 3pm.

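The PCA notes derive principal components as the top eigenvectors of the sample covariance, or equivalently the top right singular vectors of the centered data matrix. A minimal SVD-based sketch follows; the data and the choice of two components are illustrative assumptions.

    import numpy as np

    rng = np.random.RandomState(0)
    X = rng.randn(200, 5) @ rng.randn(5, 5)      # correlated synthetic data

    Xc = X - X.mean(axis=0)                      # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    k = 2
    components = Vt[:k]                          # top-k principal directions
    Z = Xc @ components.T                        # data projected to k dimensions
    explained = (s[:k] ** 2) / (s ** 2).sum()    # fraction of variance explained
    print("explained variance ratios:", explained)
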
Lecture 10 (April 8): Bayesian methods [Slides] (maximum likelihood estimation, naive Bayes)
  Reading: Notes on naive Bayes and logistic regression
  Optional: Notes on probability and statistics

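The naive Bayes reading fits class-conditional distributions by maximum likelihood and combines them with class priors via Bayes' rule. Below is a Gaussian naive Bayes sketch on toy two-class data; the binary labels and per-feature (diagonal) Gaussians are illustrative simplifications, not the specific model from the notes.

    import numpy as np

    rng = np.random.RandomState(0)
    # Toy two-class data with shifted means.
    X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 2.0])
    y = np.array([0] * 100 + [1] * 100)

    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    vars_ = np.array([X[y == c].var(axis=0) for c in classes])   # per-feature ("naive") variances

    def predict(x):
        # argmax over classes c of  log p(y=c) + sum_j log N(x_j | mean_cj, var_cj)
        log_post = np.log(priors) - 0.5 * (np.log(2 * np.pi * vars_)
                                           + (x - means) ** 2 / vars_).sum(axis=1)
        return classes[np.argmax(log_post)]

    preds = np.array([predict(x) for x in X])
    print("training accuracy:", np.mean(preds == y))
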
Lecture 11 (April 15): Graphical models [Slides]
  Reading: Tutorial on HMMs; Introduction to Bayesian networks
  Assignment: ps6 due Apr 28 at 5pm.

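The HMM tutorial covers the forward algorithm for computing the likelihood of an observation sequence. Here is a tiny worked example; the transition matrix, emission matrix, and observation sequence are all made-up numbers for illustration.

    import numpy as np

    # Toy 2-state HMM with 3 possible observation symbols.
    pi = np.array([0.6, 0.4])                    # initial state distribution
    A = np.array([[0.7, 0.3],                    # A[i, j] = P(next state j | state i)
                  [0.4, 0.6]])
    B = np.array([[0.5, 0.4, 0.1],               # B[i, k] = P(observation k | state i)
                  [0.1, 0.3, 0.6]])
    obs = [0, 2, 1]                              # an observed symbol sequence

    # Forward algorithm: alpha[i] = P(observations so far, current state = i)
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    print("P(observation sequence) =", alpha.sum())
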
Lecture 12 (April 22): Unsupervised learning [Slides]
  Reading: Notes on mixture models

Lecture 13 (April 29): EM algorithm [Slides 1, Slides 2] (mixture models, topic models, latent Dirichlet allocation)
  Reading: Notes on Expectation Maximization; The Expectation Maximization Algorithm: A Short Tutorial; Review article on topic modeling
  Explore topic models of: state-of-the-union addresses, literary studies (see also this blog), evolution of science, Wikipedia

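The EM notes work through mixtures of Gaussians in detail. The compact sketch below runs EM for a one-dimensional, two-component Gaussian mixture; the synthetic data, initial parameter guesses, and iteration count are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.RandomState(0)
    # Synthetic 1-D data drawn from two Gaussians.
    x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 200)])

    def normal_pdf(x, mu, var):
        return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

    # Initial guesses for mixing weights, means, and variances.
    pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
    for _ in range(50):
        # E-step: posterior responsibility of each component for each point.
        dens = pi * np.column_stack([normal_pdf(x, mu[k], var[k]) for k in range(2)])
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibility-weighted data.
        Nk = resp.sum(axis=0)
        pi = Nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / Nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

    print("weights:", pi, "means:", mu, "variances:", var)
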
Lecture 14 (May 6): Advanced topics
  Note: no class Tuesday May 13
  Optional: Introduction to learning to rank; Joachims' Training Linear SVMs in Linear Time; Slides on collaborative filtering; Slides on victim identification using Bayesian networks (Video)

Lecture 15 (Thu. May 15, 7:10-9:40pm): Project presentations (WWH 13th floor)

Acknowledgements: Many thanks to the University of
Washington, Carnegie Mellon University, UT Dallas, Stanford, UC
Irvine, Princeton, and MIT for sharing material used in slides and
homeworks.
Problem Set policy

I expect you to try solving each problem set on your own. However, if you get stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules:
Late submission policy: During the semester you are allowed at most two extensions on the homework assignments. Each extension is for at most 48 hours and carries a penalty of 25% off that assignment's grade.