Efficient Inference of CRFs for Large-Scale Natural Language Data

The Conference on Empirical Methods in Natural Language Processing (EMNLP 2009)

Published by Association for Computational Linguistics


This paper presents an efficient inference algorithm for conditional random fields (CRFs) on large-scale data. Our key idea is to decompose the output label states into an active set and an inactive set, in which most unsupported transitions become a constant. Our method unifies two previous methods for efficient inference of CRFs, and also yields a simple but robust special case that runs faster than exact inference when the active sets are sufficiently small. We demonstrate that our method achieves dramatic speedups on six standard natural language processing problems.
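To illustrate why constant transitions can make inference cheaper, the following is a minimal sketch and not the paper's algorithm. It assumes a linear-chain model in which every transition potential equals one shared constant except for a small sparse set, and it computes the partition function with a forward pass that factors the constant out of the sum. All names (`forward_dense`, `forward_constant_background`, `to_dense`, `sparse_trans`, `const`) are hypothetical, and potentials are kept in plain (non-log) space for brevity.

```python
from math import isclose


def forward_dense(emit, trans):
    """Standard forward pass over all label pairs: O(T * L^2).

    emit[t][j]  : potential of label j at position t
    trans[i][j] : potential of the transition i -> j
    Returns the partition function Z.
    """
    n_labels = len(trans)
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        alpha = [
            emit[t][j] * sum(alpha[i] * trans[i][j] for i in range(n_labels))
            for j in range(n_labels)
        ]
    return sum(alpha)


def forward_constant_background(emit, sparse_trans, const, n_labels):
    """Forward pass when every transition not listed in sparse_trans
    has the same potential `const` (an assumption of this sketch).

    sparse_trans maps a destination label j to {i: potential of i -> j}.
    Cost per position is O(L + number of listed transitions).
    """
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        total = sum(alpha)  # shared by every destination label
        new_alpha = []
        for j in range(n_labels):
            # Constant part for all predecessors, then correct the listed ones.
            s = const * total
            for i, w in sparse_trans.get(j, {}).items():
                s += (w - const) * alpha[i]
            new_alpha.append(emit[t][j] * s)
        alpha = new_alpha
    return sum(alpha)


def to_dense(sparse_trans, const, n_labels):
    """Expand the sparse representation into a full L x L matrix."""
    trans = [[const] * n_labels for _ in range(n_labels)]
    for j, preds in sparse_trans.items():
        for i, w in preds.items():
            trans[i][j] = w
    return trans


if __name__ == "__main__":
    # Toy potentials, made up for illustration only.
    sparse = {1: {0: 2.5}, 2: {1: 0.3, 3: 1.7}}
    emit = [[0.5, 1.0, 2.0, 0.1], [1.5, 0.2, 0.7, 1.1], [0.9, 0.4, 1.3, 0.6]]
    z_dense = forward_dense(emit, to_dense(sparse, 1.0, 4))
    z_fast = forward_constant_background(emit, sparse, 1.0, 4)
    print(isclose(z_dense, z_fast, rel_tol=1e-9))
```

Both functions compute the same quantity; the second avoids the inner loop over all predecessor labels, which is where the saving would come from when the set of non-constant transitions is small relative to the full label space.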