Project information
Instructions for final project writeup
Suggested data sets (or better: find your own!):
NYC Open Data
Data.gov
UN Data
Kaggle
Face recognition, collaborative filtering, web ranking (see bottom, under "Projects")
See here for more collaborative filtering data
20 Newsgroups
Blogs (with spam labels)
Enron e-mail data set (see also here)
Congress voting records
Twitter, Slashdot, etc.
NYTimes news articles
Useful links:
Python for data scientists
scikit.learn: Python machine learning modules (very good!)
SVMlight software (also very good)
Matlab/Octave resources (see bottom of page)
Examples of how to write up your project