Discriminative pronunciation learning using phonetic decoder and minimum classification error criterion

Oriol Vinyals; Li Deng; Dong Yu; Alex Acero

Proceedings of the ICASSP | April 2009

Published by Institute of Electrical and Electronics Engineers, Inc.

In this paper, we report our recent research aimed at improving the pronunciation-modeling component of a speech recognition system designed for mobile voice search. Our new discriminative learning technique overcomes a limitation of traditional ways of introducing alternative pronunciations, which often increase confusability across different lexical items. Instead, we use a phonetic recognizer to generate pronunciation candidates, which are then evaluated and selected using the global minimum classification error measure, guaranteeing a reduction of the training-set error rate after the alternative pronunciations are introduced. A maximum entropy approach is subsequently used to learn the weight parameters of the selected pronunciation candidates. Our experimental results demonstrate the effectiveness of the discriminative pronunciation learning technique in a real-world speech recognition task where the pronunciation of business names presents special difficulty for high-accuracy speech recognition.
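
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy recognizer that matches phone strings by edit distance, and it uses a raw training-set error count as a stand-in for the minimum classification error measure. The names `recognize`, `training_errors`, and `select_pronunciations`, and the example words and phone strings, are all hypothetical. The maximum entropy weighting of the selected candidates is omitted.

```python
from typing import Dict, List, Tuple

Lexicon = Dict[str, List[str]]   # word -> list of space-separated phone strings
Utterance = Tuple[str, str]      # (decoded phone string, reference word)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two space-separated phone strings."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))
    for i, xi in enumerate(x, 1):
        cur = [i]
        for j, yj in enumerate(y, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (xi != yj)))
        prev = cur
    return prev[-1]


def recognize(phones: str, lexicon: Lexicon) -> str:
    """Toy recognizer (assumption): pick the word with the closest pronunciation.

    Ties are broken alphabetically so the result is deterministic.
    """
    return min(
        sorted(lexicon),
        key=lambda w: min(edit_distance(phones, p) for p in lexicon[w]),
    )


def training_errors(data: List[Utterance], lexicon: Lexicon) -> int:
    """Count misrecognized training utterances (stand-in for the MCE measure)."""
    return sum(recognize(phones, lexicon) != word for phones, word in data)


def select_pronunciations(
    data: List[Utterance],
    lexicon: Lexicon,
    candidates: List[Tuple[str, str]],
) -> Tuple[Lexicon, List[Tuple[str, str]]]:
    """Greedily keep a candidate only if it strictly lowers the training error."""
    lexicon = {w: list(ps) for w, ps in lexicon.items()}  # work on a copy
    best = training_errors(data, lexicon)
    accepted = []
    for word, pron in candidates:
        trial = {w: list(ps) for w, ps in lexicon.items()}
        trial[word].append(pron)
        errs = training_errors(data, trial)
        if errs < best:  # reject candidates that do not reduce the error
            lexicon, best = trial, errs
            accepted.append((word, pron))
    return lexicon, accepted


# Hypothetical data: "pho" is listed as "p ow" but spoken as "f ah".
lexicon = {"pho": ["p ow"], "fur": ["f er"]}
data = [("f ah", "pho"), ("f ah", "pho"), ("f er", "fur")]
# Candidates as a phonetic decoder might propose them (made up here).
candidates = [("pho", "f er"), ("pho", "f ah")]

new_lexicon, accepted = select_pronunciations(data, lexicon, candidates)
```

In this toy setup the candidate `"f er"` for "pho" collides with the existing pronunciation of "fur" and does not lower the error count, so it is rejected, while `"f ah"` is kept. Because a candidate is accepted only when the count strictly drops, the training-set error never increases, which mirrors the guarantee stated in the abstract.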

© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.