Faster Decoding with Synchronous Grammars and n-gram Language Models (and how these techniques can be applied back to parsing)
Joint work with David Chiang (USC/ISI).
A major obstacle in syntax-based machine translation is the prohibitively large search space for decoding with an integrated language model. We develop faster approaches to this problem based on lazy algorithms for k-best parsing. Compared with Chiang's cube pruning technique, our method runs up to twice as fast without making more search errors or decreasing translation accuracy as measured by BLEU. We demonstrate the effectiveness of the algorithm on a large-scale translation system.
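To make the idea concrete, here is a minimal sketch (not the authors' implementation) of the lazy k-best enumeration that underlies cube pruning. It assumes two antecedent hypothesis lists already sorted by cost, and that costs combine by simple addition; in the real decoder, n-gram language model rescoring perturbs the combined scores and makes the grid non-monotonic, which is exactly what the pruning heuristic must tolerate. The function name and the two-list restriction are illustrative choices.

```python
import heapq

def k_best_combinations(a, b, k):
    """Lazily enumerate the k lowest-cost pairings of two sorted
    cost lists a and b (costs combine by addition).  This is the
    two-dimensional "cube" at the heart of cube pruning: instead
    of scoring all |a| * |b| combinations, we repeatedly pop the
    best frontier cell and push only its two neighbours."""
    if not a or not b:
        return []
    heap = [(a[0] + b[0], 0, 0)]   # (combined cost, index into a, index into b)
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        cost, i, j = heapq.heappop(heap)
        out.append((cost, i, j))
        # Expand the frontier: the successors of cell (i, j).
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (a[ni] + b[nj], ni, nj))
    return out

# Example: only 3 of the 6 combinations are ever scored.
print(k_best_combinations([1.0, 2.0, 4.0], [0.5, 3.0], 3))
```

With monotone costs this enumeration is exact; with language model rescoring it becomes an approximation, and the talk's contribution concerns making that approximation both fast and accurate.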
Interestingly, these techniques can also be applied to speed up bilexical parsing, where the (bi-)lexical probabilities can be viewed as n-gram probabilities that cause non-monotonicity. This method fits naturally into coarse-to-fine multi-pass parsing schemes.
To push this direction even further, we generalize cube pruning and lazy cube pruning into generic tools for reducing complicated search spaces, as alternatives to the well-known A* and annealing techniques.