6.864: General Information
Tahira Naseem 32-G362 (Office Hours: Wed 1:00-2:00)
Time & Location:
Tues & Thurs 1:00-2:30, 32-144
Tues 2:30-3:30 or by appointment.
6.864 is a graduate introduction to natural language processing, the study of human language from a computational perspective. We will cover syntactic, semantic and discourse processing models. The emphasis will be on machine learning or corpus-based methods and algorithms. We will describe the use of these methods and models in applications including syntactic parsing, information extraction, statistical machine translation, dialogue systems, and summarization.
This subject qualifies as an Artificial Intelligence and Applications
- Introduction (1 lecture)
- Estimation techniques and language modeling (2 lectures)
- Stochastic tagging and log-linear models (2 lectures)
- The EM algorithm in NLP (2 lectures)
- Morphology (1 lecture)
- Parsing and Syntax (2 lectures)
- Unsupervised grammar induction (1 lecture)
- Machine Translation (4 lectures)
- Deciphering (1 lecture)
- Probabilistic similarity measures and clustering (2 lectures)
- Word Sense Disambiguation (1 lecture)
- Discourse Processing (3 lectures)
- Information Retrieval (2 lectures)
Course readings will be available either on the web or in class.
The optional textbooks for this course are:
There will be 4 problem sets. The problem sets will include both
theoretical problems and some programming assignments.
There will be a final project for the class.
- Homeworks (30%), two midterms (40%), and a project (30%).
- The first midterm is scheduled for 21 October 2010.