Abstract

For many automatic speech recognition (ASR) applications, it is useful to predict the likelihood that the recognized string contains an error. This paper explores two modifications of a classic design. First, it replaces the standard maximum likelihood classifier with a maximum entropy classifier. The maximum entropy framework carries the dual advantages of discriminative training and reasonable generalization. Second, it includes a number of alternative features. Our ASR system is heavily pruned and often produces recognition lattices with only a single path. These alternative features are meant to serve as a surrogate for the typical features that can be computed from a rich lattice. We show that the maximum entropy classifier easily outperforms the standard baseline system, and that the alternative features provide consistent gains on all of our test sets.