How Microsoft Teams uses AI and machine learning to improve calls and meetings
As schools and workplaces begin resuming in-person operations, we project a permanent increase in the volume of online meetings and calls. And while communication and collaboration solutions have played a critical role in enabling continuity during these unprecedented times, early stress tests have revealed opportunities to improve and enhance meeting and call quality.
Disruptive echo effects, poor room acoustics, and choppy video are some common issues that hinder the effectiveness of online calls and meetings. Through AI and machine learning, which have become fundamental to our strategy for continual improvement, we’ve identified and are now delivering innovative enhancements in Microsoft Teams that improve such audio and video challenges in ways that are both user-friendly and scalable across environments.
Today, we’re announcing the availability of new Teams features including echo cancellation, adjusting audio in poor acoustic environments, and allowing users to speak and hear at the same time without interruptions. These build on AI-powered features recently released like expanding background noise suppression.
Voice quality improvements
During calls and meetings, when a participant has their microphone too close to their speaker, it’s common for sound to loop between input and output devices, causing an unwanted echo effect. Now, Microsoft Teams uses AI to recognize the difference between sound from a speaker and the user’s voice, eliminating the echo without suppressing speech or inhibiting the ability of multiple parties to speak at the same time.
“De-reverberation” adjusts for poor room acoustics
In specific environments, room acoustics can cause sound to bounce, or reverberate, causing the user’s voice to sound shallow as if they’re speaking within a cavern. For the first time, Microsoft Teams uses a machine learning model to convert captured audio signal to sound as if users are speaking into a close-range microphone.
Interruptibility, for more natural conversations
A natural element of conversation is the ability to interrupt for clarification or validation. This is accomplished through full-duplex (two-way) transmission of audio, allowing users to speak and hear others at the same time. When not using a headset, and especially when using devices where the speaker and microphone are very close to each other, it is difficult to remove echo while maintaining full-duplex audio. Microsoft Teams uses a model “trained” with 30,000 hours of speech samples to retain desired voices while suppressing unwanted audio signals resulting in more fluid dialogue.
Background noise suppression
Each of us has first-hand experience of a meeting disrupted by the unexpected sounds of a barking dog, a car alarm, or a slammed door. Over two years ago, we announced the release of AI-based noise suppression in Microsoft Teams as an optional feature for Windows users. Since then, we’ve continued a cycle of iterative development, testing, and evaluation to further optimize our model. After recording significant improvements across key user metrics, we have enabled machine learning-based noise suppression as default for Teams customers using Windows (including Microsoft Teams Rooms), as well as Mac and iOS users. A future release of this feature is planned for Teams Android and web clients.
These AI-driven audio enhancements are rolling out and are expected to be generally available in the coming months.
Video quality improvements
We have also recently released AI-based video and screen sharing quality optimization breakthroughs for Teams. From adjustments for low light to optimizations based on the type of content being shared, we now leverage AI to help you look and present your best.
Real-time screen optimization adjusts for the content you’re sharing
The impact of presentations can often depend on an audience’s ability to read on-screen text or watch a shared video. But different types of shared content require varied approaches to ensure the highest video quality, particularly under bandwidth constraints. Teams now uses machine learning to detect and adjust the characteristics of the content presented in real-time, optimizing the legibility of documents or smoothness of video playback.
AI-based optimization ensures your video looks great, even under bandwidth constraints
Unexpected issues with network bandwidth can lead to a choppy video that can quickly shift the focus of your presentation. AI-driven optimizations in Teams help adjust playback in challenging bandwidth conditions, so presenters can use video and screen sharing worry-free.
Brightness and focus filters that put you in the best light
Though you can’t always control the surrounding lighting for your meetings, new AI-powered filters in Teams give you the option to adjust brightness and add a soft focus for your meetings with a simple toggle in your device settings, to better accommodate for low-light environments.
Microsoft Teams: Engineered for clearer audio and fewer distractions
The past two years have made clear how important communication and collaboration platforms like Microsoft Teams are to maintaining safe, connected, and productive operations. In addition to bringing new features and capabilities to Teams, we’ll continue to explore new ways to use technology to make online calling and meeting experiences more natural, resilient, and efficient.
Visit the Tech Community Teams blog for more technical details about how we leverage AI and machine learning for audio quality improvements as well as video and screen sharing optimization in Microsoft Teams.