Audio and Acoustics Research Group

The Audio and Acoustics Group conducts research in audio processing and speech enhancement, 3D audio perception and technologies, devices for audio capture and rendering, array processing, and information extraction from audio signals.

The mission of the Audio and Acoustics Group is to develop state-of-the-art algorithms and designs for audio processing, speech enhancement, and 3D audio capture and rendering. We also work on improving the acoustical design of audio devices, such as microphones and loudspeakers. We conduct research on information retrieval from audio signals, such as speaker identification and emotion detection. Our goal is to create technologies that enable natural interaction with computers through speech and audio. At the same time, we aim to impact Microsoft’s current and future offerings in these areas.

The contact for the Audio and Acoustics Research Group is Ivan Tashev.

Etienne Thuillier, Aalto University, Finland. Spatial Audio Feature Discovery Using a Neural Network Classifier.
Xuesu Xiao, Texas A&M University, USA. Articulated Human Pose Tracking with Inertial Sensors.
Srinivas Parthasarathy, University of Texas at Dallas, USA. Speech Emotion Recognition with Convolutional Neural Networks.
Han Zhao, Carnegie Mellon University, USA. High-Accuracy Neural-Network Models for Speech Enhancement.
Jong Hwan Ko, Georgia Institute of Technology, USA. Efficient Neural-Network Design for Real-Time Speech Enhancement.
Rasool Fakoor, University of Texas at Arlington, USA. Speech Enhancement With and Without Gradient Descent.
Yan-hui Tu, University of Science and Technology of China, P. R. China. Regression Based Speech Enhancement with Neural Networks.

Amit Das, University of Illinois at Urbana-Champaign, USA. Ultrasound Based Gesture Recognition.
Vani Rajendran, University of Oxford, UK. Simple Effects that Enhance the Elevation Perception in Spatial Sound.
Zhong-Qiu Wang, Ohio State University, USA. Emotion, Gender, and Age Recognition from Speech Utterances Using Neural Networks.

Archontis Politis, Aalto University, Finland. Applications of 3-Dimensional Spherical Transforms to Acoustics and Personalization of Head-related Transfer Functions (HRTFs).
Supreeth Krishna Rao, Worcester Polytechnic Institute, USA. Ultrasound Doppler Radar.
Seyedmahdad Mirsamadi, University of Texas at Dallas, USA. DNN-based Online Speech Enhancement Using Multitask Learning and Suppression Rule Estimation.

Jinkyu Lee, Yonsei University, Korea. Emotion Detection from Speech Signals.
Felicia Lim, Imperial College London, UK. Blind Estimation of Reverberation Parameters.

Ivan Dokmanic, EPFL, Switzerland. Ultrasound Depth Imaging.
Piotr Bilinski, INRIA, France. HRTF Personalization Using Anthropometric Features.
Kun Han, Ohio State University, USA. Emotion Detection from Speech Signals.

Keith Godin, University of Texas at Dallas, USA. Open-set Speaker Identification on Noisy, Short Utterances.
Jason Wung, Georgia Tech, USA. Next Steps in Multi-Channel Acoustic Echo Reduction for Xbox Kinect.
Xing Li, University of Washington, USA. Dynamic Loudness Control for In-Car Audio.

Keith Godin, University of Texas at Dallas, USA. Binaural Sound Source Localization.

Hoang Do, Brown University, USA. A Step Towards NUI: Speaker Verification for Gaming Scenarios.

