Discriminative Learning in Speech Recognition

  • Xiaodong He
  • Li Deng

MSR-TR-2007-129

In this paper, we study the objective functions of Maximum Mutual Information (MMI), Minimum Classification Error (MCE), and Minimum Phone/Word Error (MPE/MWE) for discriminative learning in speech recognition. We present an approach that unifies these objective functions in a common rational-function form. While the rational-function form of MMI was already known, we provide a rigorous proof that a similar rational-function form exists for the objective functions of MCE and MPE/MWE. This allows the parameter optimization framework based on Growth Transformation (GT) or Extended Baum-Welch (EBW) to be applied directly to discriminative learning under all three criteria. Prior to this study, the framework was not directly applicable to MCE and MPE/MWE because their objective functions lacked the rational-function form that GT/EBW-based optimization requires. We include technical details on the derivation of the GT/EBW-based parameter optimization formulas for both discrete Hidden Markov Models (HMMs) and Continuous-Density HMMs (CDHMMs) under the MMI, MCE, and MPE/MWE criteria. For expository purposes, details on several related issues of practical significance are provided in the appendices.
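To give a feel for why a rational-function objective matters, the following is a minimal sketch of a growth-transformation style update for a single discrete probability vector. It is not the report's derived formulas. It assumes a toy objective that is a ratio of two linear functions of the parameters, and it uses the classic multiplicative form for discrete distributions with a smoothing constant. The names `a`, `b`, `gt_step`, and the rule used to pick the constant `d` are all hypothetical choices made for illustration.

```python
import numpy as np


def objective(theta, a, b):
    """Toy rational objective: ratio of two linear functions of theta."""
    return (a @ theta) / (b @ theta)


def gradient(theta, a, b):
    """Gradient of the toy objective with respect to theta."""
    num, den = a @ theta, b @ theta
    return (a * den - b * num) / den**2


def gt_step(theta, grad, d):
    """One multiplicative update: theta_i * (grad_i + d), renormalized."""
    weights = theta * (grad + d)
    return weights / weights.sum()


# Hypothetical positive coefficients for the numerator and denominator.
a = np.array([1.0, 2.0, 4.0])
b = np.array([1.0, 1.0, 3.0])

theta = np.full(3, 1.0 / 3.0)  # start from the uniform distribution
history = [objective(theta, a, b)]

for _ in range(50):
    g = gradient(theta, a, b)
    # Assumed rule: pick d large enough that every grad_i + d is positive.
    d = np.abs(g).max() + 1.0
    theta = gt_step(theta, g, d)
    history.append(objective(theta, a, b))
```

In this toy setting, the update keeps `theta` on the probability simplex and does not decrease the objective as long as every `grad_i + d` stays positive. How to choose the constant in real HMM training is a separate practical question that this sketch does not address.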