Abstract

This paper presents an overview of the Hidden Dynamic Model (HDM) paradigm, exemplifying the parametric construction of structure-based speech models that can be used for recognition. We explore a general class of HDMs that uses recursive autoregressive functions to represent the hidden speech dynamics and neural networks to represent the functional relationship between the hidden and observed speech vectors. We review this state-space formulation of the HDM in terms of model construction, a parameter estimation technique, and a decoding method. We also present typical experimental results on the use of this type of HDM for phonetic recognition and for automatic vocal tract resonance tracking. We further analyze the computational complexity (for decoding) and the parameter size of the HDM in comparison with the hidden Markov model (HMM). Finally, we discuss several key issues for future exploration of the HDM paradigm.