Statistical Machine Translation Tutorial Reading
  I found this list on Dr. David Kauchak’s webpage and think it is worth keeping, so I am reprinting it here.
  The following is a list of papers that I think are worth reading for our discussion of machine translation. I’ve tried to give a short blurb about each of the papers to put them in context. I’ve also included a number of papers marked “OPTIONAL”: I think they are interesting, but they are either supplementary or their material is more or less covered in the other papers.
  If anyone would like more information on a particular topic or would like to discuss any of these papers, feel free to e-mail me: dkauchak cs ucsd edu
Part 1 (Jan. 19)
A Statistical MT Tutorial Workbook. Kevin Knight. 1999.
  Very good introduction to word-based statistical machine translation. Written in an informal, understandable, tutorial-oriented style.
  Kevin Knight’s own account of how this workbook came about is very interesting:
  “At the time, I was trying to align English sound sequences with Japanese sound sequences, and I knew that EM could do it for me. It was hard, though. I spent two years reading Brown et al 93. When I finally got it to work, I was pretty fired up, and I told David Yarowsky. He said, ‘EM is the answer to all the world’s problems.’ Wow! I figured everybody should know about it, so I wrote ‘A Statistical MT Tutorial Workbook’.”
Automating Knowledge Acquisition for Machine Translation. Kevin Knight. 1997.
  (OPTIONAL) Another tutorial-oriented paper that steps through how one can learn from bilingual data. Also introduces a number of important concepts for MT.
Foundations of Statistical NLP, chapter 13. Manning and Schütze. 1999.
  (OPTIONAL) Must be accessed from UCSD. Overview of statistical MT. Spends a lot of time on sentence and word alignment of bilingual data.
Foundations of Statistical NLP, chapter 6. Manning and Schütze. 1999.
  (OPTIONAL) Must be accessed from UCSD. Discusses n-gram language modeling. Language modeling is crucial for SMT and many other natural language applications. I won’t spend much time discussing language modeling, but for those that are interested this is a good introduction.
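
  For readers who want a feel for what n-gram language modeling looks like before opening the chapter, here is a minimal sketch of a bigram model with add-one smoothing. It is not taken from Manning and Schütze; the function names and the boundary markers `<s>` and `</s>` are my own choices for illustration, and real systems use better smoothing than add-one.

```python
from collections import Counter
import math

def train_bigram_lm(sentences):
    """Count bigrams and their histories over tokenized sentences."""
    histories, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        histories.update(tokens[:-1])                 # every token that precedes another
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # adjacent token pairs
    # Words the model can predict: all training words plus the end marker.
    vocab = {w for sent in sentences for w in sent} | {"</s>"}
    return histories, bigrams, vocab

def bigram_prob(prev, word, histories, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (histories[prev] + len(vocab))

def sentence_logprob(sentence, histories, bigrams, vocab):
    """Log probability of a tokenized sentence under the bigram model."""
    tokens = ["<s>"] + sentence + ["</s>"]
    return sum(math.log(bigram_prob(a, b, histories, bigrams, vocab))
               for a, b in zip(tokens[:-1], tokens[1:]))

# Toy corpus (hypothetical data, just for illustration).
corpus = [["the", "house", "is", "small"],
          ["the", "house", "is", "big"],
          ["the", "book", "is", "small"]]
histories, bigrams, vocab = train_bigram_lm(corpus)
fluent = sentence_logprob(["the", "house", "is", "small"], histories, bigrams, vocab)
scrambled = sentence_logprob(["small", "is", "house", "the"], histories, bigrams, vocab)
print(fluent > scrambled)  # the word order seen in training scores higher
```

  In an SMT system this score plays the role of the fluency term: among candidate translations, the language model prefers the ones whose word order looks like the training text.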
Part 2 (Jan. 26)
Word models:
The Mathematics of Statistical Machine Translation: Parameter Estimation. P. F. Brown, S. A. Della Pietra, V. J. Della Pietra and R.L. Mercer. 1993.
  (OPTIONAL) All you ever wanted to know about word-level models. Describes IBM models 1-5 and parameter estimation for these models. It’s about 50 pages and contains a lot of material for the interested reader.
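
  To make “parameter estimation” a little more concrete, below is a minimal sketch of EM training for the simplest of these models, IBM Model 1, which only learns word translation probabilities t(f | e). This is a simplified illustration rather than the paper’s full formulation: the function name, the toy corpus and the fixed iteration count are my own assumptions, and Models 2-5 add alignment, fertility and distortion parameters that are not shown here.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """EM for IBM Model 1.

    pairs: list of (foreign_tokens, english_tokens).
    Returns a dict mapping (f, e) to an estimate of t(f | e).
    """
    f_vocab = {f for fs, _ in pairs for f in fs}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)  # start from uniform translation probabilities

    for _ in range(iterations):
        count = defaultdict(float)    # expected count of f aligned to e
        total = defaultdict(float)    # expected count of e aligned to anything
        # E-step: spread each foreign word over the English words it could align to.
        for fs, es in pairs:
            es = ["NULL"] + es        # NULL lets a foreign word align to nothing
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize the expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return dict(t)

# Toy German-English corpus (hypothetical data, just for illustration).
pairs = [(["das", "Haus"], ["the", "house"]),
         (["das", "Buch"], ["the", "book"]),
         (["ein", "Buch"], ["a", "book"])]
t = train_ibm_model1(pairs)
print(t[("das", "the")] > t[("Haus", "the")])
```

  Even though no sentence says which word aligns to which, “das” co-occurs with “the” more consistently than “Haus” or “Buch” do, so EM gradually shifts probability mass toward it.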
Word model decoding:
Decoding Algorithm in Statistical Machine Translation. Ye-Yi Wang and Alex Waibel. 1997.
  Early paper discussing decoding of IBM model 2. The paper provides a fairly good introduction to word-level decoding including multi-stack search (i.e. multiple beams) and rest cost estimation (heuristic functions).
An Efficient A* Search Algorithm for Statistical Machine Translation. Franz Josef Och, Nicola Ueffing, Hermann Ney. 2001.
  (OPTIONAL) One of many papers on decoding with word-based SMT. They discuss the basic idea of viewing decoding as state space search and provide one method for doing this. They describe decoding for Model 3 and suggest a few different heuristics that are admissible, leading to few search errors.
Phrase based statistical MT:
Statistical Phrase-Based Translation. Philipp Koehn, Franz Josef Och and Daniel Marcu. 2003.
  Good, short overview of phrase-based systems. If you want more details, see the paper below.
The Alignment Template Approach to Statistical Machine Translation. Franz Josef Och and Hermann Ney. 2004.
  (OPTIONAL) This is a journal paper discussing one phrase-based statistical system, including decoding. This is more or less the system used at ISI and is probably the best current system (though syntax-based systems may beat these in the next few years). Requires Acrobat 5 and being at UCSD.
Part 3 (Feb. 2)
Phrase-based decoding:
  See the previous paper.
Syntax based translation:
What’s in a Translation Rule? Galley, Hopkins, Knight and Marcu. 2004.
  This is the current system being investigated at ISI, and the hope is that these syntax-based systems will perform better than phrase-based systems. The paper is a bit tough to read since it’s a conference paper.
A Syntax-Based Statistical Translation Model. Yamada and Knight. 2001.
  (OPTIONAL) Predecessor model to Galley et al., but similar.
Syntax based decoding:
Foundations of Statistical NLP, chapter 12. Manning and Schütze. 1999.
  Must be on campus. This is a chapter on parsing (not actually decoding). However, since the above rules are very similar to PCFGs, decoding is very similar to parsing… just with more complications.
A Decoder for Syntax-Based Statistical MT. Kenji Yamada and Kevin Knight. 2001.
  (OPTIONAL) Decoder for the above Yamada and Knight model.
Part 4 (Feb. 9)
Discriminative Training:
Discriminative Training and Maximum Entropy Models for Statistical Machine Translation. Och and Ney. 2002.
  Learns how best to combine the different models (translation model, language model, etc.) using maximum entropy parameter estimation. This line of research is still very important and may be interesting to many of you since it’s very machine learningy.
Discriminative Reranking for Machine Translation. Shen, Sarkar and Och. 2004.
  (OPTIONAL) Given a ranked output of possible translations from the translation system, this paper uses the perceptron algorithm to learn a reranking of the sentences to improve the top translation.
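
  The two papers above share one idea: score each candidate translation as a weighted sum of model scores, and learn the weights from data. Here is a rough sketch of that idea using a plain perceptron over an n-best list. This is not the exact algorithm of either paper (Och and Ney use maximum entropy training, and Shen, Sarkar and Och use more refined perceptron variants); the feature names `tm` and `lm`, the toy numbers and the function names are all made up for illustration.

```python
def score(weights, features):
    """Weighted sum of feature values (e.g. translation and language model log scores)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def rerank(weights, nbest):
    """Index of the highest-scoring candidate in an n-best list."""
    return max(range(len(nbest)), key=lambda i: score(weights, nbest[i]))

def train_perceptron(data, epochs=10, lr=1.0):
    """data: list of (nbest, oracle_index), where each candidate is a feature dict."""
    weights = {}
    for _ in range(epochs):
        for nbest, oracle in data:
            pred = rerank(weights, nbest)
            if pred != oracle:
                # Move the weights toward the oracle candidate and away from the wrong pick.
                for name, value in nbest[oracle].items():
                    weights[name] = weights.get(name, 0.0) + lr * value
                for name, value in nbest[pred].items():
                    weights[name] = weights.get(name, 0.0) - lr * value
    return weights

# Two toy n-best lists (hypothetical scores); the better translation is at index 1 in both.
data = [
    ([{"tm": -1.0, "lm": -5.0}, {"tm": -2.0, "lm": -1.0}], 1),
    ([{"tm": -1.0, "lm": -4.0}, {"tm": -3.0, "lm": -2.0}], 1),
]
weights = train_perceptron(data)
print([rerank(weights, nbest) for nbest, _ in data])
```

  In practice the “oracle” is usually the candidate with the best evaluation score against a reference translation, and there are many more features than two.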
MT Evaluation:
BLEU: A Method for Automatic Evaluation of Machine Translation. Papineni, Roukos, Ward and Zhu. 2001.
  Foundational method for evaluating MT methods and still used currently.
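
  As a quick illustration of what BLEU measures, here is a simplified sentence-level sketch: clipped (modified) n-gram precision for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. The real metric is computed over a whole test corpus, and this sketch has no smoothing, so treat it as a teaching aid rather than a reference implementation. The function names are my own.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """Counts of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, c in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], c)
    clipped = sum(min(c, max_ref[gram]) for gram, c in cand.items())
    return clipped / sum(cand.values())

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU without smoothing (0.0 if any precision is zero)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)

refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
print(modified_precision("the the the the the the the".split(), refs, 1))
print(bleu("the cat is on the mat".split(), refs))
```

  The first call shows why clipping matters: a candidate that just repeats “the” seven times gets credit for only two of them, because no reference contains “the” more than twice.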