Yan Huang

Senior Applied Scientist

About

Yan’s research interests include machine learning and its applications in speech and natural language processing. Since joining Microsoft in 2010, she has worked on improving the accuracy and robustness of acoustic models for Microsoft speech products. Her current research topics include large-scale semi-supervised deep learning for acoustic models, improving acoustic model robustness to phonetic and non-phonetic acoustic variability, model adaptation, and far-talk speech recognition.

Before joining Microsoft, she was with the Center for Language and Speech Processing (CLSP) at Johns Hopkins University and the International Computer Science Institute (ICSI) at UC Berkeley.

Publications

Other

· Qi Li and Yan Huang, “An auditory-based feature extraction algorithm for robust speaker identification under mismatched conditions”, IEEE Trans. on Speech and Audio Processing, vol. 19, no. 6, pp. 1791-1801, August 2011

· Qi Li and Yan Huang, “Robust speaker identification using an auditory-based feature”, in the Proceedings of the 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, Dallas, 2010.

· Gerald Friedland, Oriol Vinyals, Yan Huang, and Christian Müller, “Prosodic and other Long-Term Features for Speaker Diarization,” IEEE Transactions on Audio, Speech, and Language Processing, Vol 17, No 5, pp 985–993, July 2009

· Gerald Friedland, Oriol Vinyals, Yan Huang, and Christian Müller, “Fusing short term and long term features for improved speaker diarization,” in the Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, Taipei, 2009

· Hayley Hung, Yan Huang, Gerald Friedland, and Daniel Gatica-Perez, “Estimating Dominance in Multi-Party Conversations Using Automatically Generated Audio Cues,” IEEE Transactions on Audio, Speech, and Language Processing, 2009

· Hayley Hung, Yan Huang, Gerald Friedland, and Daniel Gatica-Perez, “Correlating audio-visual cues in a dominance estimation framework,” in the Proceedings of the 2009 CVPR workshop on human behavior, 2009

· Hayley Hung, Yan Huang, Gerald Friedland, and Daniel Gatica-Perez, “Estimating the Dominant Person in Multi-Party Conversations Using Speaker Diarization Strategies,” in the Proceedings of the 2008 interspeech, Las Vegas, 2008

· Yan Huang, Oriol Vinyals, Gerald Friedland, Christian Müller, Nikki Mirghafori, Chuck Wooters, “A fast-match approach for robust, faster than real-time speaker diarization”, in the Proceedings of the 2007 IEEE Automatic Speech Recognition and Understanding Workshop, 2007

· Michael Pucher and Yan Huang, “Optimization of Latent Semantic Analysis Based Language Model Interpolation for Meeting Recognition”, Fifth Slovenian and First International Language Technologies Conference, Slovenia, 2006

· Brigitte Bigi, Yan Huang, and Renato De Mori, “Vocabulary and Language Model Adaptation using Information Retrieval”, in the Proceedings of the International Conference on Spoken Language Processing, Jeju Island, Korea, 2004

· Yan Huang and Bo Xu, “A Novel Model TD-PSPTP for Speech Synthesis”, in the Proceedings of the 6th European Conference on Speech Communication and Technology, Budapest, Hungary, 1999

· Yan Huang and Taiyi Huang, “Neural Learning Approach for Duration Parameter Generation in Mandarin Speech Synthesis”, in the Proceedings of the 1st International Symposium on Chinese Spoken Language Processing, Singapore, 1998