Joint work with David Chiang (University of Maryland)
K-best parsing (and k-best processing in general) has become a popular technique in natural language processing. However, fast and exact k-best algorithms are largely unknown to the parsing community. In this work, we develop efficient algorithms for exact k-best derivation trees in the framework of directed monotonic hypergraphs. To demonstrate the efficiency, scalability and accuracy of these algorithms, we present experiments on Bikel’s implementation of Collins’ lexicalized PCFG model, and on Chiang’s CFG-based decoder for hierarchical phrase-based translation. We show in particular how the improved output of our algorithms has the potential to improve results from parse reranking systems and other applications.
These algorithms have been re-implemented by other researchers in the field, including Eugene Charniak for his n-best parser, Ryan McDonald and Microsoft Research for dependency parsers, and the USC/ISI team for the CKY-based decoder in their syntax-based machine translation system. All of these experiments confirmed the findings in our work.
Liang Huang and David Chiang (2005). Better k-best Parsing. Proceedings of the 9th International Workshop on Parsing Technologies (IWPT). http://www.cis.upenn.edu/~lhuang3/huang-iwpt.pdf