Frank Soong

Principal Researcher and Manager, Speech Group

About

Frank Soong is a Principal Researcher and Manager of the Speech Group, which conducts research on speech modeling, recognition, and synthesis.

Projects

Photo-Real Talking Head with Deep Bidirectional LSTM

Established: September 29, 2014

Purpose: We propose to use deep bidirectional LSTM for audio/visual modeling in our photo-real talking head system. Abstract: Long Short-Term Memory (LSTM) is a specific recurrent neural network (RNN) architecture that was designed to model temporal sequences and…

Voice Conversion with Neural Network

Established: March 24, 2014

Sequence Error (SE) Minimization Training of Neural Network for Voice Conversion. Neural network (NN) based voice conversion, which employs a nonlinear function to map the features from a source to a target speaker, has been shown to…

Turning a Monolingual Speaker Into Multi-Lingual Speaker

Established: February 21, 2012

A voice user interface needs to output responses in Text-To-Speech (TTS) synthesized speech. Sometimes it is even more desirable to have the response in mixed languages. For example, in a foreign country, it would be convenient if a user of car-navigation…

Other

He received his BS, MS, and Ph.D. degrees, all in EE, from the National Taiwan University, the University of Rhode Island, and Stanford University, respectively. He joined Bell Labs Research, Murray Hill, NJ, USA in 1982, worked there for 20 years, and retired as a Distinguished Member of Technical Staff in 2001. At Bell Labs, he worked on various aspects of acoustics and speech processing, including speech coding, speech and speaker recognition, stochastic modeling of speech signals, efficient search algorithms, discriminative training, dereverberation of audio and speech signals, microphone array processing, acoustic echo cancellation, and hands-free noisy speech recognition. He was also responsible for transferring recognition technology from research to AT&T voice-activated cell phones, which were rated by the Mobile Office Magazine as the best among the competing products evaluated. He was a co-recipient of the Bell Labs President Gold Award for developing the Bell Labs Automatic Speech Recognition (BLASR) software package.

He visited Japan twice as a visiting researcher: first from 1987 to 1988, at the NTT Electro-Communication Labs, Musashino, Tokyo; then from 2002 to 2004, at the Spoken Language Translation Labs, ATR, Kyoto. In 2004, he joined Microsoft Research Asia (MSRA), Beijing, China, to lead the Speech Research Group. He is a visiting professor at the Chinese University of Hong Kong (CUHK) and the co-director of the CUHK-MSRA Joint Research Lab, recently promoted to a National Key Lab of the Ministry of Education, China.

He was the co-chair of the 1991 IEEE International Arden House Speech Recognition Workshop. He is a committee member of the IEEE Speech and Language Processing Technical Committee of the Signal Processing Society and has served as an associate editor of the IEEE Transactions on Speech and Audio Processing. He has published extensively, coauthoring more than 200 technical papers in the speech and signal processing fields. He is an IEEE Fellow.

Speech Group’s home page.