6.864: About
Instructors:
Regina Barzilay,
Michael Collins
Time & Location:
Tues & Thurs 1-2.30, 32-155
Office Hours:
By appointment
Course Description:
Graduate introduction to natural language processing, the study of human
language from a computational perspective. Syntactic, semantic and discourse
processing models. Emphasis on machine learning or corpus-based methods and
algorithms. Use of these methods and models in applications including syntactic
parsing, information extraction, statistical machine translation, dialogue
systems, and summarization.
This subject qualifies as an Artificial Intelligence and Applications
concentration subject.
Syllabus:
- Introduction (1 lecture)
- Estimation techniques and language modeling (1 lecture)
- Parsing and Syntax (5 lectures)
- The EM algorithm in NLP (1 lecture)
- Stochastic tagging and log-linear models (2 lectures)
- Probabilistic similarity measures and clustering (2 lectures)
- Machine Translation (2 lectures)
- Discourse Processing: segmentation, anaphora resolution (3 lectures)
- Dialogue systems (1 lecture)
- Natural Language Generation/Summarization (1 lecture)
- Unsupervised methods in NLP (1 lecture)
Readings:
Course readings will be available either on the web or as in-class
handouts.
Academic Integrity:
Everything you do for credit in this subject is supposed to be your own work.
You can talk to other students (and instructors) about approaches to problems, but then
you should sit down and do the problem yourself. This is not only the ethical
way but also the only effective way of learning the material.
Objectives:
Upon completion of 6.864, students will be able to explain and apply
fundamental algorithms and techniques in the area of natural language
processing (NLP). In particular, students will:
- Understand approaches to syntax and semantics in NLP.
- Understand approaches to discourse, generation, dialogue and
summarization within NLP.
- Understand current methods for statistical approaches to machine
translation.
- Understand machine learning techniques used in NLP, including hidden
Markov models and probabilistic context-free grammars, clustering and
unsupervised methods, log-linear and discriminative models, and the EM
algorithm as applied within NLP.
Measurable Outcomes and Assessment Methods:
Students completing 6.864 will have demonstrated an ability to:
- Understand the mathematical and linguistic foundations underlying
approaches to the above areas in NLP (measured by problem sets and
quizzes).
- Design, implement and test algorithms for NLP problems (measured by
problem sets).
Evaluation: Midterm (20%), final exam (30%), and five homework assignments (50%).