Inversion Transduction Grammar with Linguistic Constraints


November 30, 2006


Colin Cherry


University of Alberta


Bilingual word alignment, the task of finding word-to-word connections between a sentence and its translation, is an important part of knowledge acquisition for statistical machine translation. An Inversion Transduction Grammar, or ITG, provides an efficient algorithm for aligning a bilingual sentence pair by simultaneously parsing the two sentences. However, the simple bracketing grammar usually employed in ITG parsing has no linguistic content. We investigate two methods to inform the ITG parser with the phrase structure from a linguistically motivated dependency tree: a phrasal cohesion constraint on a simple ITG aligner, and a complete discriminative ITG parser that uses cohesion with the dependency tree as a soft constraint. The latter system not only recovers links lost by the hard constraint, but also improves link recall beyond that of a completely unconstrained system.


Colin Cherry

Colin Cherry is a PhD student at the University of Alberta. He received his undergraduate degree from Acadia University. His graduate research focuses on machine learning methods and syntactic extensions for bilingual word alignment. Colin works with Dekang Lin and is supported by an NSERC Canada Graduate Scholarship and an Alberta Ingenuity Studentship.