Text to Speech

Established: November 1, 2018

We are working on neural network-based text to speech (TTS), including acoustic models, vocoders, frontends, and end-to-end text-to-waveform models. Our research has been transferred to the Microsoft Azure TTS service to improve the product experience.

Product Transfer (Azure TTS page: https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/)

 • Our FastSpeech now supports more than 70 languages in the Microsoft Azure Text to Speech service! [News-1] [News-2]
 • Our LRSpeech helps Azure TTS extend to 5 new low-resource languages! [News]
 • Our AdaSpeech has been deployed in Microsoft Azure TTS to support custom voice.

Paper Publication (Speech demo page: https://speechresearch.github.io/)

 • Jiawei Chen, Xu Tan, Yichong Leng, Jin Xu, Guihua Wen, Tao Qin, Tie-Yan Liu, Speech-T: Transducer for Text to Speech and Beyond, NeurIPS 2021.
 • Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu, A Survey on Neural Speech Synthesis, arXiv 2021. [Paper] [Article-1] [Article-2] [Github]
 • Sang-gil Lee, Heeseung Kim, Chaehun Shin, Xu Tan, Chang Liu, Qi Meng, Tao Qin, Wei Chen, Sungroh Yoon, Tie-Yan Liu, PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Driven Adaptive Prior, arXiv 2021. [Paper]
 • Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu, AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style, INTERSPEECH 2021.
 • Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu, AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data, ICASSP 2021. [Paper]
 • Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu, LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search, ICASSP 2021. [Paper]
 • Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu, DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling, ICASSP 2021. [Paper]
 • Yichong Leng, Xu Tan, Sheng Zhao, Frank Soong, Xiang-Yang Li, Tao Qin, MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network, ICASSP 2021. [Paper]
 • Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu, AdaSpeech: Adaptive Text to Speech for Custom Voice, ICLR 2021. [Paper]
 • Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu, FastSpeech 2: Fast and High-Quality End-to-End Text to Speech, ICLR 2021. [Paper] [Blog]
 • Jiawei Chen, Xu Tan, Jian Luan, Tao Qin, Tie-Yan Liu, HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis, arXiv 2020. [Paper]
 • Peiling Lu, Jie Wu, Jian Luan, Xu Tan, Li Zhou, XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System, INTERSPEECH 2020. [Paper]
 • Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin, Tie-Yan Liu, MultiSpeech: Multi-Speaker Text to Speech with Transformer, INTERSPEECH 2020. [Paper]
 • Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu, DeepSinger: Singing Voice Synthesis with Data Mined From the Web, KDD 2020. [Paper]
 • Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu, LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition, KDD 2020. [Paper] [Blog]
 • Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan, ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit, ICASSP 2020. [Paper]
 • Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu, FastSpeech: Fast, Robust and Controllable Text to Speech, NeurIPS 2019. [Paper] [Demo] [Article] [Reddit]
 • Hao Sun, Xu Tan, Jun-Wei Gan, Sheng Zhao, Dongxu Han, Hongzhi Liu, Tao Qin, Tie-Yan Liu, Knowledge Distillation from BERT in Pre-training and Fine-tuning for Polyphone Disambiguation, ASRU 2019. [Paper]
 • Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu, Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion, INTERSPEECH 2019. [Paper]
 • Yi Ren, Xu Tan, Tao Qin, Zhou Zhao, Sheng Zhao, Tie-Yan Liu, Almost Unsupervised Text to Speech and Automatic Speech Recognition, ICML 2019. [Paper] [Demo] [Article] [Blog] [Slides] [Video]

People

Xu Tan

Principal Research Manager

Tao Qin

Senior Principal Research Manager

Rui Wang

Researcher

Renqian Luo

Researcher

Chang Liu

Researcher

Tie-Yan Liu

Distinguished Scientist, Assistant Managing Director