Tone Articulation Modeling for Mandarin Spontaneous Speech Recognition

  • Jian-lai Zhou,
  • Ye Tian,
  • ,
  • Chao Huang,
  • Eric Chang

IEEE

Published by Institute of Electrical and Electronics Engineers, Inc.

Tone modeling is an unavoidable problem in Mandarin speech recognition. In continuous speech, the pitch contour exhibits variable patterns and is strongly influenced by its tone context. Although several effective methods have been proposed to improve the accuracy of tonal syllable recognition in Mandarin continuous speech, many recognition errors are still caused by the poor tone discrimination capability of the acoustic model. The problem is worse for spontaneous speech. In this paper, we report our work on tone articulation modeling. Tone-context-dependent models are used to model the unstable pitch patterns caused by co-articulation in continuous speech, and the corresponding acoustic features are investigated as well. Our methods are evaluated on two test sets, one of reading-style speech and one of spontaneous speech. The experimental results show that, on the spontaneous (casual) speech test set, the proposed method is more effective than the tone-context-independent model, while the two are comparable on the reading-style test set. Several factors that could further improve the proposed method are discussed in the final part of this paper.
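
The abstract does not specify how tone context is encoded, so the following is only an illustrative sketch under stated assumptions. It assumes syllables are written as pinyin with a trailing tone digit (1 to 4, with 5 for the neutral tone) and that each tone is labeled with its left and right neighbors in a triphone-like `left-center+right` notation. The function names, the boundary marker `0`, and the label format are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical marker for an utterance boundary (not from the paper).
BOUNDARY = "0"


def split_tone(syllable: str) -> Tuple[str, str]:
    """Split a pinyin syllable such as 'hao3' into ('hao', '3').

    Assumes the tone is the final character, 1-4 for lexical tones
    and 5 for the neutral tone.
    """
    if len(syllable) > 1 and syllable[-1] in "12345":
        return syllable[:-1], syllable[-1]
    raise ValueError(f"no tone digit found in syllable: {syllable!r}")


def tone_independent_labels(syllables: List[str]) -> List[str]:
    """Tone-context-independent labels: one label per tone, ignoring neighbors."""
    return [split_tone(s)[1] for s in syllables]


def tone_context_labels(syllables: List[str]) -> List[str]:
    """Tone-context-dependent labels in an assumed 'left-center+right' format."""
    tones = tone_independent_labels(syllables)
    padded = [BOUNDARY] + tones + [BOUNDARY]
    return [
        f"{padded[i]}-{padded[i + 1]}+{padded[i + 2]}"
        for i in range(len(tones))
    ]


if __name__ == "__main__":
    utterance = ["ni3", "hao3", "ma5"]
    print(tone_independent_labels(utterance))  # ['3', '3', '5']
    print(tone_context_labels(utterance))      # ['0-3+3', '3-3+5', '3-5+0']
```

In this sketch the two third tones in "ni3 hao3" receive different labels (`0-3+3` and `3-3+5`), so a model trained on such labels could in principle learn separate pitch patterns for the same tone in different contexts, which is the kind of co-articulation effect the abstract describes.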