Reading Group
Since there is increasing interest in statistical machine translation
(SMT) around the world and at MIT, our reading group meets to
review research and discuss new ideas.
If you have any questions, please send me email.
Sign up on the mailing list!
Please volunteer!
It would be great if you could volunteer to present a paper or even
your own research.
Schedule
Thursday, November 18, 2pm, Room 32-G451 | Presenter: David Kauchak Paper: "Robust Sub-Sentential Alignment of Phrase-Structure Trees", Declan Groves, Mary Hearne and Andy Way (Dublin City University) | |
Thursday, November 4, 2pm, Room 32-G451 | Presenter: Philipp Koehn Papers: "Example-based Machine Translation Based on Syntactic Transfer with Statistical Models", Kenji Imamura, Hideo Okuma, Taro Watanabe, and Eiichiro Sumita (ATR); "Hierarchical Phrase Alignment Harmonized with Parsing", Kenji Imamura (ATR) | pdf |
Thursday, October 28, 2pm, Room 32-G451 | Presenter: Luke Zettlemoyer Paper: "Syntax-Based Alignment: Supervised or Unsupervised", Hao Zhang and Daniel Gildea (Univ. Rochester) | |
Thursday, October 21, 2pm, Room 32-G451 | Presenter: Philipp Koehn Paper: "A Path-Based Transfer Model for Machine Translation", Dekang Lin (Univ. Alberta) | |
Thursday, July 15, 2pm, Room 32-261 | Presenter: Philipp Koehn Paper: "Improving a Statistical MT System with Automatically Learned Rewrite Patterns", Fei Xia and Michael McCord (IBM) | |
Thursday, July 8, 2pm, Room 32-261 | Presenter: Brooke Cowan Paper: "Greedy Decoding for Statistical Machine Translation in Almost Linear Time", Ulrich Germann (ISI) | |
Thursday, July 1, 2pm, Room 32-261 | Presenter: Philipp Koehn I will share some impressions from the DARPA MT Eval Workshop, which took place last week in Washington, DC. | - |
Thursday, June 10, 2pm, Room 32-G451 | Presenter: Philipp Koehn Paper: "Discriminative Reranking for Statistical Machine Translation", Shen, Sarkar, and Och | - |
Thursday, May 14, 2pm, Room 32-G451 | Presenter: Philipp Koehn Let's meet tomorrow and chat about the new papers in statistical MT presented at HLT-NAACL. I will also share some lessons from this year's DARPA MT Eval, which is going on right now. | - |
Thursday, April 29, 2pm, Room 32-261 | Presenter: Philipp Koehn Paper: "Minimum Error Rate Training in Statistical Machine Translation", Franz Och (ISI) | |
Thursday, April 22, 2pm, Room 32-261 | Presenter: Michael Collins Paper: "Head Automata and Bilingual Tiling: Translation with Minimal Representations", Hiyan Alshawi (AT&T) | |
Thursday, April 15, 2pm, Room 32-261 | Presenter: Philipp Koehn Tutorial: "My statistical machine translation system: A look under the hood" Everybody should have a rough understanding of the phrase translation model that I discussed in the tutorial. Check my NAACL paper for some background. Tomorrow, I will open the hood and offer a look into the inner workings, the data structure files, etc. At the end of the day, you will be able to train and run your own statistical machine translation system. |
paper handout manual |
Thursday, April 8, 3pm, Room 32-261 | Presenter: Luke Zettlemoyer Paper: "What's in a translation rule?", Michael Galley (Columbia), Mark Hopkins (UCLA), Kevin Knight and Daniel Marcu (USC) Abstract: We propose a theory that gives formal semantics to word-level alignments defined over parallel corpora. We use our theory to introduce a linear algorithm that can be used to derive from word-aligned, parallel corpora the minimal set of syntactically motivated transformation rules that explain human translation data. |
|
Thursday, April 1, 3pm, Room 32-261 | Presenter: David D. Palmer, Virage Advanced Technology Group Talk: "Statistical Machine Translation in Real-time Multilingual Video Processing" Recent advances in statistical machine translation have significantly improved the speed of MT systems and the readability of their output. These advances have enabled the integration of statistical MT into large-scale language processing environments. I will discuss and demonstrate the use of MT in a fully automated real-time broadcast news video and audio processing system. The system combines speech recognition, statistical machine translation, and cross-lingual information retrieval components to enable real-time search and alerting from live English, Arabic, and Mandarin news sources. |
- |
Thursday, March 18, 3pm, 8th floor playroom | Presenter: Philipp Koehn Tutorial: "Introduction to Statistical Machine Translation", part 3 This Thursday we will finish the tutorial with the latest developments and ideas in statistical machine translation: phrase-based MT, currently the best-performing method; efforts to make use of syntax; and discriminative training. |
- |
Wednesday, March 10, 4pm, 8th floor playroom | Presenter: Philipp Koehn Tutorial: "Introduction to Statistical Machine Translation", part 2 I will cover in more detail the EM algorithm and generative models such as IBM Model 4. |
ps |
Thursday, February 26, 4pm, 8th floor playroom | Presenter: Philipp Koehn Tutorial: "Introduction to Statistical Machine Translation", part 1 As an introduction, Philipp Koehn will go over a tutorial on SMT that he presented last year together with Kevin Knight at the HLT/NAACL and MT Summit conferences. This will provide a gentle introduction to the state of the art. |
ps |