Coarticulation Modeling by Embedding a Target-Directed Hidden Trajectory Model into HMM-MAP Decoding and Evaluation

Frank Seide; Jian-Lai Zhou; Li Deng

Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | January 2003

The Hidden Dynamic Model (HDM) has been an attractive acoustic modeling approach because it provides a computational model for coarticulation and the dynamics of human speech. However, the lack of a direct decoding algorithm has been a barrier to research progress on HDM. We have developed a new HDM-based acoustic model, the Hidden-Trajectory HMM (HTHMM), which combines the state/mixture topology of a traditional monophone HMM with a target-directed hidden-trajectory model (a special form of HDM) for coarticulation modeling. Because the classical Viterbi algorithm is not admissible, we have developed a novel MAP decoding algorithm for HTHMM that correctly takes the hidden continuous trajectory into account. This paper introduces our new HTHMM decoder, which allows, for the first time, an HDM-type model to be evaluated by direct decoding instead of N-best rescoring. Using direct decoding, we demonstrate that the coarticulatory mechanism of our HTHMM matches traditional context-dependent modeling (enumeration of model parameters): the context-independent HTHMM has slightly better accuracy than a cross-word triphone HMM on the Aurora2 task. The decoder also enables us to include state-boundary optimization in the HDM/HTHMM training procedure. This paper presents the detailed decoding algorithm and evaluation results, while in [1] we present the HTHMM model itself and parameter training.