Improved Speech Modeling and Recognition Using Multi-Dimensional Articulatory States as Primitive Speech Units

  • Li Deng,
  • J. Nu,
  • H. Sameti

International Conference on Acoustics, Speech, and Signal Processing, ICASSP-95.

Published by IEEE

We provide a formal description of a speech recognizer designed on the basis of elaborate articulatory timing that is asynchronous across the multiple articulatory-feature dimensions. Three improved critical components of the recognizer are described in detail. Evaluation results, obtained from a standard TIMIT phonetic recognition task confined to an N-best rescoring scenario, compare the performance of the new feature-based recognizer with that of a recognizer using conventional context-dependent triphone units. The results demonstrate an overall superior quality of the rescored N-best list from the feature-based recognizer over that from the triphone-based recognizer. Greater performance improvements are observed as the number of top candidate sentences considered increases.