Maximal Lattice Overlap in Example-Based Machine Translation

  • Rebecca Hutchinson
  • Paul N. Bennett
  • Jaime Carbonell
  • Peter Jansen
  • Ralf Brown

CMU-LTI-03-174

CMU-CS-03-138, Computer Science Department, School of Computer Science, Carnegie Mellon University. (Shares significant overlap in content with the publication listing for MT Summit IX, 2003.)

Example-Based Machine Translation (EBMT) retrieves pre-translated phrases from a sentence-aligned bilingual training corpus to translate new input sentences. EBMT uses long pre-translated phrases effectively but is prone to disfluencies at the boundaries between phrasal translations. We address this problem by introducing a novel method that exploits overlapping phrasal translations and the increased confidence in translation accuracy that such overlap implies. We specify an efficient algorithm for producing translations using overlap. Finally, our empirical analysis indicates that this approach produces higher-quality translations than the standard EBMT method in a peak-to-peak comparison.
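
To make the idea of overlapping phrasal translations concrete, the following is a minimal illustrative sketch and not the algorithm specified in the report. It assumes that target-language fragments arrive already ordered left to right as token lists, and it simply joins neighbouring fragments on their longest shared suffix/prefix. The names `overlap_len` and `join_fragments` are hypothetical and chosen only for this example; the report's method may score and select fragments quite differently.

```python
from typing import List


def overlap_len(left: List[str], right: List[str]) -> int:
    """Return the length of the longest suffix of `left` that equals a prefix of `right`."""
    # Try the longest candidate overlap first and shrink until one matches.
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def join_fragments(fragments: List[List[str]]) -> List[str]:
    """Join ordered target-language fragments, merging tokens they share at the seams.

    Assumption (not from the report): fragments are given in output order,
    and agreement on the overlapping tokens is taken as a sign that the two
    fragments fit together at that boundary.
    """
    output: List[str] = []
    for fragment in fragments:
        k = overlap_len(output, fragment)
        # Append only the tokens not already covered by the overlap.
        output.extend(fragment[k:])
    return output


if __name__ == "__main__":
    fragments = [
        ["the", "red", "house"],
        ["red", "house", "is"],
        ["is", "big"],
    ]
    print(" ".join(join_fragments(fragments)))
```

In this toy setting, fragments that agree on their shared tokens join without a visible seam, which is the intuition behind preferring overlapping translations at phrase boundaries. Fragments with no overlap are simply concatenated, which is where boundary disfluencies would be expected to arise.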