Discriminative Learning and Spanning Tree Algorithms for Dependency Parsing


October 4, 2005


Ryan McDonald


University of Pennsylvania


In recent years discriminative learning techniques have seen a surge of interest in the NLP community due to their ability to tractably incorporate millions of dependent and linguistically rich features. In many fields, most notably information extraction, discriminative models have become the standard. In this talk I will describe a generalization of the multi-class online large-margin algorithms of Crammer and Singer (2003) to structured outputs. I apply this learning framework to the problem of extracting dependency tree representations of sentences, in conjunction with a spanning tree (maximum branching) parsing framework that yields efficient algorithms for both projective and non-projective structures. I show that parsers trained in this framework achieve state-of-the-art accuracy when combined with a rich feature set. Furthermore, I will describe experiments showing that these parsers extend naturally and can be adapted to new domains through additional features derived from in-domain and out-of-domain classifiers.
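The announcement itself contains no code, but the spanning-tree view mentioned above can be illustrated concretely. The following is a minimal sketch of the Chu-Liu-Edmonds maximum spanning arborescence algorithm over a dense matrix of arc scores, which is the search procedure underlying non-projective parsing in this framework. All names and the toy scores below are my own illustration; in the actual parser, arc scores come from learned feature weights, not fixed numbers.

```python
def _find_cycle(head, n):
    """Return a list of nodes forming a cycle under `head`, or None."""
    color = [0] * n          # 0 = unvisited, 1 = on current path, 2 = done
    for start in range(1, n):
        if color[start]:
            continue
        path, v = [], start
        while v != 0 and color[v] == 0:
            color[v] = 1
            path.append(v)
            v = head[v]
        if v != 0 and color[v] == 1:          # walked back onto our own path
            return path[path.index(v):]
        for u in path:
            color[u] = 2
    return None


def chu_liu_edmonds(scores):
    """Maximum spanning arborescence rooted at node 0 (the artificial root).

    scores[h][d] is the score of an arc from head h to dependent d.
    Returns `head`, where head[d] is the chosen head of d (head[0] = -1).
    """
    n = len(scores)
    # 1. Every non-root node greedily takes its highest-scoring head.
    head = [-1] * n
    for d in range(1, n):
        head[d] = max((h for h in range(n) if h != d),
                      key=lambda h: scores[h][d])
    cycle = _find_cycle(head, n)
    if cycle is None:                         # greedy choice is already a tree
        return head

    # 2. Contract the cycle into a super-node `c` and build reduced scores.
    cyc = set(cycle)
    rest = [v for v in range(n) if v not in cyc]
    idx = {v: i for i, v in enumerate(rest)}  # old index -> new index
    c = len(rest)                             # new index of the super-node
    sub = [[float('-inf')] * (c + 1) for _ in range(c + 1)]
    enter = {}   # new head index -> original dependent receiving the arc
    leave = {}   # new dependent index -> original head inside the cycle
    for h in range(n):
        for d in range(1, n):
            if h == d or (h in cyc and d in cyc):
                continue
            if d in cyc:     # arc entering the cycle: pay for the broken arc
                adj = scores[h][d] - scores[head[d]][d]
                if adj > sub[idx[h]][c]:
                    sub[idx[h]][c] = adj
                    enter[idx[h]] = d
            elif h in cyc:   # arc leaving the cycle: keep the best exit point
                if scores[h][d] > sub[c][idx[d]]:
                    sub[c][idx[d]] = scores[h][d]
                    leave[idx[d]] = h
            elif scores[h][d] > sub[idx[h]][idx[d]]:
                sub[idx[h]][idx[d]] = scores[h][d]

    # 3. Solve the smaller problem, then expand the super-node.
    sub_head = chu_liu_edmonds(sub)
    for i in range(1, c + 1):
        if i == c:                   # arc into the cycle breaks one cycle arc
            head[enter[sub_head[c]]] = rest[sub_head[c]]
        elif sub_head[i] == c:       # arc out of the cycle
            head[rest[i]] = leave[i]
        else:
            head[rest[i]] = rest[sub_head[i]]
    return head
```

This naive recursive contraction runs in O(n^3) time; the efficient algorithms referred to in the abstract use an O(n^2) variant of the same idea, so this sketch conveys the structure rather than the asymptotics. For example, with four nodes where tokens 1 and 2 greedily pick each other as heads, the contraction step breaks the cycle at the cheapest point and returns a valid (possibly non-projective) tree.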


Ryan McDonald received his BSc from the University of Toronto and is currently completing his PhD at the University of Pennsylvania under the supervision of Fernando Pereira. His main area of research involves developing and applying machine learning techniques to structured natural language processing problems for which standard dynamic programming solutions are intractable, including non-nested syntactic tree representations, discontinuous segmentations of text and complex relation extraction. Web site: http://www.cis.upenn.edu/~ryantm