Audio and Acoustics Research Group

Established: April 5, 2011





The Audio and Acoustics group conducts research in audio processing and speech enhancement, 3D audio perception and technologies, devices for audio capture and rendering, array processing, and information extraction from audio signals.

The mission of the Audio and Acoustics Group is to develop state-of-the-art algorithms and designs for audio processing, speech enhancement, and 3D audio capture and rendering. We also work on better acoustical design of audio devices, such as microphones and loudspeakers. We conduct research in information retrieval from audio signals, such as speaker identification and emotion detection. Our goal is to create technologies that enable natural interaction with computers through speech and audio. At the same time, we try to impact Microsoft’s current and future offerings in these areas.

The contact for the Audio and Acoustics Research Group is Ivan Tashev.


Anderson Avila, Institut National de la Recherche Scientifique (INRS-EMT), Canada. Deep Neural Network Models for Audio Quality Assessment.
Andrea Genovese, New York University Steinhardt, USA. Blind Room Parameter Estimation in Real Time from Single-Channel Audio Signals in Noisy Conditions.
Benjamin Martinez Elizalde, Carnegie Mellon University, USA. A Cross-modal Audio Search Engine based on Joint Audio-Text Embeddings.
Chen Song, University at Buffalo, the State University of New York, USA. Sensor Fusion for Learning-based Motion Estimation in VR.
Christoph F. Hold, Technische Universität Berlin, Germany. Improvements on Higher Order Ambisonics Reproduction in the Spherical Harmonics Domain Under Real-time Constraints.
Harishchandra Dubey, University of Texas at Dallas, USA. MSR-Freesound: Advancing Audio Event Detection & Classification through Efficient Deep Learning Approaches.
Sebastian Braun, Friedrich-Alexander University of Erlangen Nuremberg (FAU), Germany. Speech Enhancement Using Linear and Non-linear Spatial Filtering for Head-mounted Displays.

Etienne Thuillier, Aalto University, Finland. Spatial Audio Feature Discovery Using a Neural Network Classifier.
Xuesu Xiao, Texas A&M University, USA. Articulated Human Pose Tracking with Inertial Sensors.
Srinivas Parthasarathy, University of Texas at Dallas, USA. Speech Emotion Recognition with Convolutional Neural Networks.
Han Zhao, Carnegie Mellon University, USA. High-Accuracy Neural-Network Models for Speech Enhancement.
Jong Hwan Ko, Georgia Institute of Technology, USA. Efficient Neural-Network Design for Real-Time Speech Enhancement.
Rasool Fakoor, University of Texas at Arlington, USA. Speech Enhancement With and Without Gradient Descent.
Yan-hui Tu, University of Science and Technology of China, P. R. China. Regression Based Speech Enhancement with Neural Networks.

Amit Das, University of Illinois at Urbana-Champaign, USA. Ultrasound Based Gesture Recognition.
Vani Rajendran, University of Oxford, UK. Simple Effects that Enhance the Elevation Perception in Spatial Sound.
Zhong-Qiu Wang, Ohio State University, USA. Emotion, Gender, and Age Recognition from Speech Utterances Using Neural Networks.

Archontis Politis, Aalto University, Finland. Applications of 3-Dimensional Spherical Transforms to Acoustics and Personalization of Head-related Transfer Functions (HRTFs).
Supreeth Krishna Rao, Worcester Polytechnic Institute, USA. Ultrasound Doppler Radar.
Seyedmahdad Mirsamadi, University of Texas at Dallas, USA. DNN-based Online Speech Enhancement Using Multitask Learning and Suppression Rule Estimation.

Jinkyu Lee, Yonsei University, Korea. Emotion Detection from Speech Signals.
Felicia Lim, Imperial College London, UK. Blind Estimation of Reverberation Parameters.

Ivan Dokmanic, EPFL, Switzerland. Ultrasound Depth Imaging.
Piotr Bilinski, INRIA, France. HRTF Personalization Using Anthropometric Features.
Kun Han, Ohio State University, USA. Emotion Detection from Speech Signals.

Keith Godin, University of Texas at Dallas, USA. Open-set Speaker Identification on Noisy, Short Utterances.
Jason Wung, Georgia Tech, USA. Next Steps in Multi-Channel Acoustic Echo Reduction for Xbox Kinect.
Xing Li, University of Washington, USA. Dynamic Loudness Control for In-Car Audio.

Keith Godin, University of Texas at Dallas, USA. Binaural Sound Source Localization.

Hoang Do, Brown University, USA. A Step Towards NUI: Speaker Verification for Gaming Scenarios.

In the News

Hearing in 3D with Dr. Ivan Tashev
Microsoft Research Podcast, November 14, 2018

Beyond Tomorrow – A vision study by Brüel & Kjær
Beyond Tomorrow, December 4, 2017
Ivan Tashev is on the expert panel

Is Sound the Secret Sauce for Making Immersive Experiences?
VRScout, October 24, 2017

Listeners Seeing What They Hear: Virtual Reality & 3D Acoustics Integration
The Science Times, June 28, 2017

3D sound to let you hear Walking Dead zombies first
The Times, June 27, 2017

Researchers use head related transfer functions to personalize audio in mixed and virtual reality
June 26, 2017

Creating a personalized, immersive audio environment
ScienceDaily, June 26, 2017

Be There: 3D Audio Virtual Presence (video)
TechFest, March 25, 2015

Headset provides ‘3D soundscape’ to help blind people navigate cities
The Guardian, November 7, 2014

How 3D audio technology could ‘unlock’ cities for blind people
The Telegraph, November 6, 2014

Cities Unlocked: Lighting up the world through sound (video)
Microsoft UK, November 6, 2014

3D audio is the secret to HoloLens’ convincing holograms
Engadget, November 2, 2016

Virtual Reality May Become the Next Great Media Platform—But Can It Fool All Five Senses?
Singularity Hub, September 28, 2014

What’s Missing from Virtual Reality? Immersive 3D Soundscapes
Singularity Hub, July 6, 2014

Microsoft’s “3-D Audio” Gives Virtual Objects a Voice
MIT Technology Review, June 4, 2014

Microsoft 3D audio tech makes virtual sounds sound real
Windows Central, June 4, 2014

Audio Advances Help Xbox One Determine Signal from Noise
Microsoft Research Blog, October 16, 2013

Ivan Tashev Helps Make Microsoft Sound Great (video)
Microsoft Research Luminaries, October 16, 2013

Keynote – Ivan Tashev Optimizing Kinect: Audio and Acoustics (video)
ITA 2012, February 14, 2012

Tellme and the Voice of Kinect
Microsoft – The AI Blog, August 1, 2011

Kinect Audio: Preparedness Pays Off
Microsoft Research Blog, April 14, 2011

MSR NUI Panel with Curtis Wong & Ivan Tashev (video)
Channel 9 Live at MIX11, April 13, 2011

Audio for Kinect: From Idea to “Xbox, Play!” (video)
MIX11, March 15, 2011

Team Life