Abstract

We present an integrated view of the speech preprocessing and speech modeling problems in the design of a hidden Markov model (HMM) based speech recognizer. The integrated model developed in this study generalizes the widely used delta-parameter technique, which has so far been confined to the preprocessing domain, in two significant ways. First, the new model contains state-dependent weighting functions that transform static speech features into dynamic ones in a slowly time-varying manner. Second, a novel maximum-likelihood learning algorithm is developed for the model that jointly optimizes the state-dependent weighting functions and the remaining conventional HMM parameters. Experimental results from a standard TIMIT phonetic classification task provide preliminary evidence for the effectiveness of our new, general approach to using the dynamic characteristics of speech spectra. The results show that the new approach is most effective for discriminating stop consonants exhibiting the fastest and most conspicuous dynamic patterns.
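
To make the contrast with the conventional delta-parameter technique concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the formulation used in the paper: the function names, the edge handling (repeating the first and last frame), and the per-frame state labels are all hypothetical. The conventional technique is taken here to be the common regression formula, which applies one fixed set of window weights to the static features, whereas the generalized form lets each HMM state supply its own weights. Learning those weights jointly with the other HMM parameters is outside the scope of the sketch.

```python
def delta_weights(K):
    """Fixed regression weights of a conventional delta computation
    over a window of 2K+1 frames (assumed formula: k / (2 * sum k^2))."""
    norm = 2 * sum(k * k for k in range(1, K + 1))
    return [k / norm for k in range(-K, K + 1)]


def dynamic_features(static, weights):
    """Weighted sum of static frames in a window around each frame.

    static  : list of frames, each a list of floats
    weights : 2K+1 window weights
    Edges are handled by repeating the first/last frame (an assumption).
    """
    K = len(weights) // 2
    T = len(static)
    out = []
    for t in range(T):
        frame = [0.0] * len(static[0])
        for j, w in enumerate(weights):
            src = static[min(max(t + j - K, 0), T - 1)]
            for d, x in enumerate(src):
                frame[d] += w * x
        out.append(frame)
    return out


def state_dependent_dynamic_features(static, state_weights, states):
    """Hypothetical generalization: frame t is transformed with the
    window weights belonging to the state assigned to that frame."""
    return [dynamic_features(static, state_weights[s])[t]
            for t, s in enumerate(states)]


if __name__ == "__main__":
    static = [[0.0], [1.0], [2.0], [3.0], [4.0]]   # a linear ramp
    fixed = delta_weights(1)                        # [-0.5, 0.0, 0.5]
    print(dynamic_features(static, fixed))
    # State 0 keeps the delta weights; state 1 uses all-zero weights.
    state_weights = {0: fixed, 1: [0.0, 0.0, 0.0]}
    print(state_dependent_dynamic_features(static, state_weights,
                                           [0, 0, 1, 1, 0]))
```

With a single state whose weights equal `delta_weights(K)`, the state-dependent version reduces to the conventional fixed-weight computation, which is the sense in which the latter is a special case.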
