Fast and Accurate Arc Filtering for Dependency Parsing

  • Colin Cherry | National Research Council Canada

Graph-based dependency parsing finds direct syntactic relationships between words in a sentence by connecting head-modifier pairs into a tree structure. We propose a series of learned arc filters to speed up this process. A cascade of filters identifies implausible head-modifier pairs, with time complexity that is first linear, and then quadratic, in the length of the sentence. The linear filters reliably predict, in context, words that are roots or leaves of dependency trees, and words that are likely to have heads on their left or right. We use this information to quickly prune arcs from the dependency graph. More than 78% of total arcs are pruned while retaining 99.5% of the true dependencies. These filters improve the speed of two state-of-the-art dependency parsers, with low overhead and negligible loss in accuracy.
We also present ongoing work on improving the performance of overlapping linear filters through joint training, in which filter interactions are handled using latent variables.
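
To make the pruning step concrete, the following is a minimal sketch of how per-token decisions from the linear filters might be turned into a pruned arc set. It is an illustration under stated assumptions, not the implementation presented in the talk: the `TokenFlags` class, its field names, the `prune_arcs` function, and the convention that the artificial root sits to the left of every word are all hypothetical. The filter decisions are supplied by hand here, whereas the talk describes learned classifiers. Note also that this naive loop enumerates all candidate arcs and is therefore quadratic, even though each flag is a per-word (linear) decision.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

ROOT = 0  # artificial root node; real tokens are numbered 1..n (assumption)


@dataclass
class TokenFlags:
    """Hypothetical per-token filter decisions (hand-set here, not learned)."""
    is_root: bool = False     # token attaches to the artificial root
    is_leaf: bool = False     # token has no modifiers, so it is never a head
    head_left: bool = False   # token's head lies to its left
    head_right: bool = False  # token's head lies to its right


def prune_arcs(flags: List[TokenFlags]) -> Set[Tuple[int, int]]:
    """Return the (head, modifier) arcs that survive the token-level filters."""
    n = len(flags)
    kept = set()
    for m in range(1, n + 1):          # candidate modifier
        fm = flags[m - 1]
        for h in range(0, n + 1):      # candidate head, including ROOT
            if h == m:
                continue               # no self-loops
            if h != ROOT and flags[h - 1].is_leaf:
                continue               # a leaf cannot head any arc
            if fm.is_root and h != ROOT:
                continue               # a root word only attaches to ROOT
            if fm.head_left and h > m:
                continue               # head must be on the left
            if fm.head_right and h < m:
                continue               # head must be on the right (ROOT counts as left)
            kept.add((h, m))
    return kept


# "the dog barks": the=1, dog=2, barks=3
flags = [
    TokenFlags(is_leaf=True, head_right=True),  # the
    TokenFlags(head_right=True),                # dog
    TokenFlags(is_root=True),                   # barks
]
kept = prune_arcs(flags)
print(sorted(kept))
```

In this toy sentence, 4 of the 9 candidate arcs survive, and the three arcs of the intended tree, (2, 1), (3, 2) and (0, 3), are all among them. A parser would then search only over the surviving arcs.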

Speaker Details

Colin Cherry is a Research Officer in the Institute for Information Technology at the National Research Council Canada. He received his doctorate from the University of Alberta, where he studied under Dekang Lin. Before coming to the NRC, he worked as a Researcher in Microsoft Research’s natural language processing group. He is interested in predicting structured outputs, with application to parsing, morphology and machine translation.

    • Colin Cherry

    • Jeff Running