Audio and Acoustics Research Group

Established: April 5, 2011


Microsoft Research Blog


The Audio and Acoustics group conducts research in audio processing and speech enhancement, 3D audio perception and technologies, devices for audio capture and rendering, array processing, and information extraction from audio signals.

The mission of the Audio and Acoustics Group is to develop state-of-the-art algorithms and designs for audio processing, speech enhancement, and 3D audio capture and rendering. We also work on better acoustical design of audio devices, such as microphones and loudspeakers, and we conduct research on information retrieval from audio signals, such as speaker identification and emotion detection. Our goal is to create technologies that enable natural interaction with computers through speech and audio. At the same time, we aim to influence Microsoft’s current and future offerings in these areas.

The contact for the Audio and Acoustics Research Group is Ivan Tashev.


Arindam Jati, University of Southern California (USC), Los Angeles, USA. Supervised Deep Hashing for Efficient Audio Retrieval.
Benjamin Martinez Elizalde, Carnegie Mellon University, USA. Sound Event Recognition for Video-Content Analysis.
Fabian Brinkmann, Technical University of Berlin, Germany. Efficient and Perceptually Plausible 3-D Sound for Virtual Reality.
Hakim Si Mohammed, INRIA Rennes, France. Improving the Ergonomics and User-Friendliness of SSVEP-based BCIs in Virtual Reality.
Md Tamzeed Islam, University of North Carolina at Chapel Hill, USA. Anthropometric Feature Estimation using Sensors on Headphone for HRTF Personalization.
Morayo Ogunsina, Penn State Erie, USA. Hearing AI App for Sound-Based User Surrounding Awareness.
Nicholas Huang, Johns Hopkins University, USA. Decoding Auditory Attention Via the Auditory Steady-State Response for Use in a Brain-Computer Interface.
Sahar Hashemgeloogerdi, University of Rochester, USA. Integrating Beamforming and Multichannel Linear Prediction for Dereverberation and Denoising.
Wenkang An, Carnegie Mellon University, USA. Decoding Multisensory Attention from Electroencephalography for Use in a Brain-Computer Interface.
Yangyang (Raymond) Xia, Carnegie Mellon University, USA. Real-time Single-channel Speech Enhancement with Recurrent Neural Networks.

Anderson Avila, Institut National de la Recherche Scientifique (INRS-EMT), Canada. Deep Neural Network Models for Audio Quality Assessment.
Andrea Genovese, New York University Steinhardt, USA. Blind Room Parameter Estimation in Real Time from Single-Channel Audio Signals in Noisy Conditions.
Benjamin Martinez Elizalde, Carnegie Mellon University, USA. A Cross-modal Audio Search Engine based on Joint Audio-Text Embeddings.
Chen Song, University at Buffalo, the State University of New York, USA. Sensor Fusion for Learning-based Motion Estimation in VR.
Christoph F. Hold, Technische Universität Berlin, Germany. Improvements on Higher Order Ambisonics Reproduction in the Spherical Harmonics Domain Under Real-time Constraints.
Harishchandra Dubey, University of Texas at Dallas, USA. MSR-Freesound: Advancing Audio Event Detection & Classification through Efficient Deep Learning Approaches.
Sebastian Braun, Friedrich-Alexander University of Erlangen Nuremberg (FAU), Germany. Speech Enhancement Using Linear and Non-linear Spatial Filtering for Head-mounted Displays.

Etienne Thuillier, Aalto University, Finland. Spatial Audio Feature Discovery Using a Neural Network Classifier.
Xuesu Xiao, Texas A&M University, USA. Articulated Human Pose Tracking with Inertial Sensors.
Srinivas Parthasarathy, University of Texas at Dallas, USA. Speech Emotion Recognition with Convolutional Neural Networks.
Han Zhao, Carnegie Mellon University, USA. High-Accuracy Neural-Network Models for Speech Enhancement.
Jong Hwan Ko, Georgia Institute of Technology, USA. Efficient Neural-Network Design for Real-Time Speech Enhancement.
Rasool Fakoor, University of Texas at Arlington, USA. Speech Enhancement With and Without Gradient Descent.
Yan-hui Tu, University of Science and Technology of China, P. R. China. Regression Based Speech Enhancement with Neural Networks.

Amit Das, University of Illinois at Urbana-Champaign, USA. Ultrasound Based Gesture Recognition.
Vani Rajendran, University of Oxford, UK. Simple Effects that Enhance the Elevation Perception in Spatial Sound.
Zhong-Qiu Wang, Ohio State University, USA. Emotion, Gender, and Age Recognition from Speech Utterances Using Neural Networks.

Archontis Politis, Aalto University, Finland. Applications of 3-Dimensional Spherical Transforms to Acoustics and Personalization of Head-related Transfer Functions (HRTFs).
Supreeth Krishna Rao, Worcester Polytechnic Institute, USA. Ultrasound Doppler Radar.
Seyedmahdad Mirsamadi, University of Texas at Dallas, USA. DNN-based Online Speech Enhancement Using Multitask Learning and Suppression Rule Estimation.
Long Le, University of Illinois at Urbana-Champaign, USA. Spatial Probability for Sound Source Localization.

Jinkyu Lee, Yonsei University, Korea. Emotion Detection from Speech Signals.
Felicia Lim, Imperial College London, UK. Blind Estimation of Reverberation Parameters.

Ivan Dokmanic, EPFL, Switzerland. Ultrasound Depth Imaging.
Piotr Bilinski, INRIA, France. HRTF Personalization Using Anthropometric Features.
Kun Han, Ohio State University, USA. Emotion Detection from Speech Signals.

Keith Godin, University of Texas at Dallas, USA. Open-set Speaker Identification on Noisy, Short Utterances.
Jason Wung, Georgia Tech, USA. Next Steps in Multi-Channel Acoustic Echo Reduction for Xbox Kinect.
Xing Li, University of Washington, USA. Dynamic Loudness Control for In-Car Audio.

Keith Godin, University of Texas at Dallas, USA. Binaural Sound Source Localization.

Hoang Do, Brown University, USA. A Step Towards NUI: Speaker Verification for Gaming Scenarios.

In the News

Microsoft Research Podcast, November 14, 2018
Beyond Tomorrow, December 4, 2017
Ivan Tashev is on the expert panel
VRScout, October 24, 2017
The Science Times, June 28, 2017
The Times, June 27, 2017, June 26, 2017
ScienceDaily, June 26, 2017
TechFest, March 25, 2015
The Guardian, November 7, 2014
The Telegraph, November 6, 2014
Microsoft UK, November 6, 2014
Engadget, November 2, 2016
Singularity Hub, September 28, 2014
Singularity Hub, July 6, 2014
MIT Technology Review, June 4, 2014
Windows Central, June 4, 2014
Microsoft Research Blog, October 16, 2013
Microsoft Research Luminaries, October 16, 2013
ITA 2012, February 14, 2012
Microsoft – The AI Blog, August 1, 2011
Microsoft Research Blog, April 14, 2011
Channel 9 Live at MIX11, April 13, 2011
MIX11, March 15, 2011
