Generating Case Markers in Machine Translation

  • Kristina Toutanova
  • Hisami Suzuki

Proceedings of NAACL |

Published by Association for Computational Linguistics

We study the use of rich syntax-based statistical models for generating grammatical case for the purpose of machine translation from a language which does not indicate case explicitly (English) to a language with a rich system of surface case markers (Japanese). We propose an extension of n-best re-ranking as a method of integrating such models into a statistical MT system and show that this method substantially outperforms standard n-best re-ranking. Our best performing model achieves a statistically significant improvement over the baseline MT system according to the BLEU metric. Human evaluation also confirms the results.