Your project group must consist of exactly 2 people.
A 2-page PDF of your project proposal (one per 2-person group) is due by Thursday Mar. 27 at 3pm and should be e-mailed to Akshay Kumar (akshaykumar {@ | at} nyu.edu). The proposal will then be given to your assigned project adviser, who will send you feedback ~1 week later. Please note that your project proposal will factor into your overall project grade, so make sure that it is well written and follows the guidelines below.
I have posted links to a few publicly available data sets here (by no means a comprehensive list!), and may post more over the next few weeks.
You are strongly encouraged to think outside the box: look for new problems that you can tackle using machine learning, and for where you can get the data. Also, feel free to use the course mailing list to discuss ideas or send pointers to data sets that you think might be interesting for other students.
Your project proposal must detail the data that you plan to use, how you will pre-process it, and a precise plan of action. The plan should cover the questions you would like to ask or the problems you would like to solve, the machine learning algorithm(s) you hope to apply, how you will perform your evaluation (e.g., for supervised prediction you might use cross-validation and report accuracy, then analyze your false positives/negatives to understand where and why the algorithms succeed or fail), a timeline for your work, and an explanation of what you expect to learn from your project. I strongly encourage you to download the data and explore it carefully before submitting your project proposal.
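To make the evaluation example concrete, here is a minimal sketch in Python of k-fold cross-validation that reports per-fold accuracy and collects the false positives and false negatives for later inspection. Everything in it is an assumption made for illustration: the synthetic two-class data, the toy nearest-centroid classifier, and all function names are hypothetical placeholders for whatever data and algorithm your project actually uses.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Toy classifier (illustrative only): one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def cross_validate(X, y, k=5, seed=0):
    """Return per-fold accuracies plus indices of false positives/negatives.

    Labels are assumed to be 0 (negative) and 1 (positive).
    """
    accuracies, false_pos, false_neg = [], [], []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = 0
        for i in held_out:
            pred = predict(model, X[i])
            if pred == y[i]:
                correct += 1
            elif pred == 1:
                false_pos.append(i)  # predicted positive, truly negative
            else:
                false_neg.append(i)  # predicted negative, truly positive
        accuracies.append(correct / len(held_out))
    return accuracies, false_pos, false_neg


# Synthetic data (an assumption): two Gaussian blobs, 50 points per class.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

accuracies, false_pos, false_neg = cross_validate(X, y, k=5)
mean_accuracy = sum(accuracies) / len(accuracies)
print("per-fold accuracy:", accuracies)
print("mean accuracy:", mean_accuracy)
print("false positives (row indices):", sorted(false_pos))
print("false negatives (row indices):", sorted(false_neg))
```

The returned index lists point back to the original rows, so you can pull up the misclassified examples and look for patterns in where and why the model fails. In a real project you would likely replace the toy classifier and the hand-written fold logic with a library implementation.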
This is meant to be open-ended, and I don't expect any two projects to be similar. The goal here is for you to spend time thinking deeply about machine learning. To give you an idea of the scope, I am expecting you to spend ~40 hours (per person) between now and the end of the semester on the project. Do not forget that you will also have 2-3 more homework assignments, so you will have to split your time between the project and regular homework.