Teaching Computers to Speak in Tongues

Posted by Kevin Schofield

 

Perhaps it’s just me, but I wince whenever my American phone or GPS tries to pronounce a French restaurant name or a Spanish street name, or the name of one of my non-American friends. While we’ve made much progress in creating text-to-speech (TTS) systems with human-sounding voices in a comforting accent, they haven’t fared well in our multilingual world.

Frank Soong and his team from Microsoft Research Asia have been working to solve that problem by “cross-training” a text-to-speech system so that it can correctly pronounce words from multiple languages, even if it was built from the voice samples of someone who speaks only one.

Frank gave a demo of their TTS system during Rick Rashid’s keynote speech at Techfest yesterday. You can see it in the video below.

This is another great example of how the next generation of technologies will make interacting with computers more natural and blend physical and virtual elements more seamlessly.