Reducing Boundary Friction Using Translation-Fragment Overlap

  • Ralf D. Brown,
  • Rebecca Hutchinson,
  • Paul N. Bennett,
  • Jaime G. Carbonell,
  • Peter Jansen

Proceedings of the Machine Translation Summit IX (Shares significant overlap in content with publication listing for CMU-CS-03-138, 2003.)

Many corpus-based Machine Translation (MT) systems generate a number of partial translations that are then pieced together, rather than immediately producing one overall translation. While this makes them more robust to ill-formed input, they are subject to disfluencies at phrasal translation boundaries even for well-formed input. We address this “boundary friction” problem by introducing a method that exploits overlapping phrasal translations and the increased confidence in translation accuracy they imply. We specify an efficient algorithm for producing translations using overlap. Finally, our empirical analysis indicates that, in a peak-to-peak comparison, this approach produces higher-quality translations than the standard method of combining non-overlapping fragments generated by our Example-Based MT (EBMT) system.