Introduction To Machine Learning
Fall 2013
Overview
Machine learning is an exciting and fast-moving field of computer science with many recent consumer applications (e.g., Microsoft Kinect, Google Translate, iPhone's Siri, digital camera face detection, Netflix recommendations, Google News) and applications within the sciences and medicine (e.g., predicting protein-protein interactions, species modeling, detecting tumors, personalized medicine). In this undergraduate-level class, students will learn about the theoretical foundations of machine learning and how to apply machine learning to solve new problems.

General information
Lectures: Tuesday and Thursday, 11am–12:15pm
Office hours: Tuesday, 5–6pm
Location: 715 Broadway, 12th floor, Room 1204
Grading: problem sets (50%) + midterm exam (25%) + project (20%) + participation (5%)
Prerequisites: Basic Algorithms (CS 310) is required, but it can be taken concurrently. Students should be very comfortable with basic mathematical skills in addition to good programming skills. Some knowledge of linear algebra and multivariable calculus will be helpful.
Mailing list: To subscribe to the class list, follow the instructions here.
Schedule
Note: the Bishop and Murphy readings are optional.
Lecture 1: Sept 3 (Tues)
Topic: Overview [Slides]
Reading: Chapter 1 of Murphy's book; Bishop, Chapter 1 (optional)

Lecture 2: Sept 5 (Th)
Topic: Introduction to learning [Slides]. Loss functions, the perceptron algorithm, proof of the perceptron mistake bound
Reading: Barber, 17.1–17.2 (stop before 17.2.1) on least-squares regression and 29.1.1–29.1.4 (review of vector algebra); notes on the perceptron mistake bound (section 1 only)
Assignment: ps1 (data), due Sept 17 at 11am

Lecture 3: Sept 10 (Tues)
Topic: Linear classifiers [Slides]. Introduction to support vector machines
Reading: Notes on support vector machines (sections 1–4); Bishop, Section 4.1.1 (pg. 181–182) and Chapter 7 (pg. 325–328); Murphy, Section 14.5.2 (pg. 498–501)

Lecture 4: Sept 12 (Th)
Topic: Support vector machines [Slides]
Reading: See above. Also: Bishop, Sections 7.1.1 and 7.1.3

Lecture 5: Sept 17 (Tues)
Topic: Support vector machines (continued) [Slides]. Derivation of the SVM dual, introduction to kernels
Reading: Notes on the SVM dual and kernel methods (sections 3–8); if you would like a second reference, see these notes (sections 5–8); Bishop, Section 6.2, Section 7.1 (except for 7.1.4), and Appendix E; Murphy, Chapter 14 (except 14.4 and 14.7)
Assignment: ps2, due Sept 24 at 11am [Solutions]

Lecture 6: Sept 19 (Th)
Topic: Kernel methods [Slides]
Reading: See above. Optional: for more on SVMs, see Hastie, Sections 12.1–12.3 (pg. 435); for more on cross-validation, see Hastie, Section 7.10 (pg. 250). Optional: for more advanced kernel methods, see Chapter 3 of this book (free online from NYU libraries)

Lecture 7: Sept 24 (Tues)
Topic: Kernel methods & optimization. Mercer's theorem, convexity
Reading: Lecture notes

Lecture 8: Sept 26 (Th)
Topic: Learning theory [Slides]. Generalization of finite hypothesis spaces
Reading: Lecture notes. These have only high-level overviews: Murphy, Section 6.5.4 (pg. 209); Bishop, Section 7.1.5 (pg. 344)
Assignment: ps3 (data), due Oct. 8 at 11am

Lecture 9: Oct 1 (Tues)
Topic: Learning theory (continued) [Slides]. VC-dimension
Reading: Notes on learning theory

Lecture 10: Oct 3 (Th)
Topic: Learning theory (continued) [Slides]. Also margin-based generalization
Reading: Notes on gap-tolerant classifiers (section 7.1, pg. 29–31)

Lecture 11: Oct 8 (Tues)
Topic: Nearest neighbor methods [Slides]
Reading: Hastie et al., Sections 13.3–13.5 (on nearest neighbor methods); Bishop, Section 14.4 (pg. 663); Murphy, Section 16.2
Assignment: ps4, due Oct. 17 at 11am [Solutions]

Lecture 12: Oct 10 (Th). No class on Oct 15 (fall recess)
Topic: Decision trees [Slides]
Reading: Mitchell, Ch. 3; Rudin's lecture notes

Lecture 13: Oct 17 (Th)
Topic: Ensemble methods [Slides]. Random forests
Reading: Hastie et al., Section 8.7 (bagging). Optional: Hastie et al., Chapter 15 (on random forests)

Oct 22 (Tues): Midterm exam
Reading: A Few Useful Things to Know About Machine Learning

Lecture 14: Oct 24 (Th)
Topic: Clustering [Slides]. K-means
Reading: Hastie et al., Sections 14.3.6, 14.3.8, 14.3.9, 14.3.12; Murphy, Sections 25.1 and 25.5–25.5.3; Bishop, Section 9.1 (pg. 424)
Assignment: Project proposal, due Oct. 31 at 5pm by email

Lecture 15: Oct 29 (Tues)
Topic: Clustering (continued) [Slides]. Hierarchical clustering
Reading: See above.

Lecture 16: Oct 31 (Th)
Topic: Clustering (continued) [Slides]. Spectral clustering
Reading: Hastie et al., Section 14.5.3. Optional: tutorial on spectral clustering; Murphy, Section 25.4

Lecture 17: Nov 5 (Tues)
Topic: Introduction to Bayesian methods [Slides]. Probability, decision theory
Reading: Murphy, Sections 3–3.3; Bishop, Sections 2–2.3.4

Lecture 18: Nov 7 (Th)
Topic: Naive Bayes [Slides]
Reading: Murphy, Sections 3.4 and 3.5 (naive Bayes), 5.7 (decision theory); Bishop, Section 1.5 (decision theory)

Lecture 19: Nov 12 (Tues)
Topic: Logistic regression [Slides]
Reading: Notes on naive Bayes and logistic regression; Murphy, 8–8.3 (logistic regression) and 8.6 (generative vs. discriminative); Bishop, 4.2–4.3.4 (logistic regression)
Assignment: ps5, due Nov 21 at 11am [Solutions]

Lecture 20: Nov 14 (Th)
Topic: Mixture models, EM algorithm [Slides]
Reading: Notes on mixture models; notes on expectation maximization; Murphy, 11–11.4.2.5 and Section 11.4.7; Bishop, Sections 9.2, 9.3, 9.4

Lecture 21: Nov 19 (Tues)
Topic: EM algorithm (continued) [Slides]

Lecture 22: Nov 21 (Th)
Topic: Hidden Markov models [Slides]
Reading: Notes on HMMs; tutorial on HMMs; Murphy, Chapter 17; Bishop, Sections 8.4.1 and 13.1–13.2

Lecture 23: Nov 26 (Tues)
Topic: Dimensionality reduction (PCA) [Slides]
Reading: Notes on PCA; more notes on PCA; Bishop, Sections 12.1 (PCA) and 12.4.1 (ICA). Optional: Barber, Chapter 15
Assignment: ps6 (data), due Dec. 5 at 11am

Lecture 24: Dec 3 (Tues). No class on Nov 28 (Thanksgiving)
Topic: Bayesian networks [Slides]. Latent Dirichlet allocation
Reading: Review article on topic modeling; introduction to Bayesian networks

Lecture 25: Dec 5 (Th)
Topic: Collaborative filtering [Slides]
Reading: Overview of matrix factorization

Lecture 26: Dec 10 (Tues)
Topic: Applications in computational biology [Slides]
Reading: An introduction to graphical models

Dec 12 (Th): Project presentations (group 1)

Dec 17 (Tues), 10–11:50am: Project presentations (everyone else). During the final exam slot; note the special time! Same location.

Reference materials

Problem Set policy
I expect you to try solving each problem set on your own. However, if you get stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules:
