Jinyu Li

Principal Applied Scientist


Jinyu Li received the Ph.D. degree from Georgia Institute of Technology, Atlanta, in 2008. From 2000 to 2003, he was a Researcher at the Intel China Research Center and a Research Manager at iFlytek Speech, China. He is currently a Principal Applied Scientist and Technical Lead at Microsoft Corporation, Redmond, WA, where he leads a team that designs and improves speech modeling algorithms and technologies to ensure industry state-of-the-art speech recognition accuracy for Microsoft products such as Cortana and Xbox Kinect. His major research interests cover several topics in speech recognition, including deep learning, noise robustness, discriminative training, feature extraction, and machine learning methods. He is the lead author of the book “Robust Automatic Speech Recognition: A Bridge to Practical Applications”, Academic Press, October 2015. He is currently a member of the IEEE Speech and Language Processing Technical Committee. He also serves as an associate editor of IEEE/ACM Transactions on Audio, Speech and Language Processing.




New book “Robust Automatic Speech Recognition: A Bridge to Practical Applications” is published!

The first book that provides a comprehensive review of noise- and reverberation-robust speech recognition methods in the era of deep neural networks.

Connects robust speech recognition techniques to machine learning paradigms with rigorous mathematical treatment.

Provides elegant, structured ways to categorize and analyze noise-robust speech recognition techniques.



Honors and patents

Selected Honors

  • Microsoft Research Technical Transfer Award, 2014
  • Microsoft patent awards, 2008-present
  • Interspeech 2006 Best Student Paper
  • Colonel Oscar P. Cleaver Award (for the highest score on the Ph.D. preliminary exam in ECE, Georgia Institute of Technology, 2004)
  • Guo Moruo Scholarship (the highest honor in USTC)


Patents

  • Learning Student DNN via Output Distribution
  • Variable-Component Deep Neural Network For Robust Speech Recognition
  • Shared Hidden Layer Combination for Speech Recognition Systems
  • Low-footprint Adaptation and Personalization for Deep Neural Network
  • Restructuring deep neural network acoustic models
  • Exploiting heterogeneous data in deep neural network based speech recognition systems
  • Efficient implementation of posterior-based feature with partial distance elimination
  • Multilingual Deep Neural Network
  • Utilizing Scalar Operations For Recognizing Utterances During Automatic Speech Recognition In Noisy Environments
  • Online distorted speech estimation within an unscented transformation framework
  • Confidence Calibration in Automatic Speech Recognition Systems
  • Model training for automatic speech recognition from imperfect transcription data
  • Phase sensitive model adaptation for noisy speech recognition
  • Adapting a compressed model for use in speech recognition
  • High-performance HMM adaptation with joint compensation of additive and convolutive distortions via vector Taylor series


Thinking outside-of-the-black-box of machine learning on the long quest to perfecting automatic speech recognition
