Multilingual Modulation by Neural Language Codes

  • Markus Müller | Karlsruhe Institute of Technology

Multilingual speech recognition is a very costly AI problem, because each language, and even each accent, requires its own acoustic model to reach optimal recognition performance. Even when the same phone symbols are used across languages, each language and accent imposes its own coloring or “twang”, a shift in the acoustic realization of sounds. In this talk, I will outline an approach that uses a large multilingual neural network that is modulated by language codes. These codes are generated by an ancillary network that learns to encode useful differences between the “twangs” of human languages. This network architecture allows quick adaptation to languages.
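
The idea of modulating a shared network with a language code can be sketched in a few lines. The abstract does not specify the architecture, so everything below is an illustrative assumption rather than the speaker's implementation: the layer sizes, the untrained random weights, the utterance-level averaging of the code, and the choice of a multiplicative sigmoid gate on the hidden units are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes are assumptions chosen for illustration only.
N_FEATS = 40    # acoustic feature dimension per frame
N_HIDDEN = 64   # hidden units in the shared multilingual network
N_PHONES = 30   # size of the shared phone inventory
CODE_DIM = 8    # dimension of the language code


def init(shape):
    """Small random weights; a real system would learn these."""
    return rng.normal(0.0, 0.1, size=shape)


W_code = init((N_FEATS, CODE_DIM))   # ancillary network (hypothetical)
W_in = init((N_FEATS, N_HIDDEN))     # shared network, input layer
W_gate = init((CODE_DIM, N_HIDDEN))  # maps language code to per-unit gates
W_out = init((N_HIDDEN, N_PHONES))   # shared network, output layer


def language_code(frames):
    """Ancillary network: one code per utterance, averaged over frames."""
    return np.tanh(frames @ W_code).mean(axis=0)


def softmax(x):
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def modulated_forward(frames, code):
    """Shared network whose hidden units are scaled by code-derived gates."""
    hidden = np.tanh(frames @ W_in)                 # shape (T, N_HIDDEN)
    gate = 1.0 / (1.0 + np.exp(-(code @ W_gate)))   # shape (N_HIDDEN,), in (0, 1)
    return softmax((hidden * gate) @ W_out)         # shape (T, N_PHONES)


frames = rng.normal(size=(100, N_FEATS))  # stand-in for 100 feature frames
code = language_code(frames)
posteriors = modulated_forward(frames, code)
```

In this sketch the shared weights stay fixed across languages and only the code changes, which is one way to picture how a single network could adapt quickly to a language.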

Speaker Details

Markus Müller received his Master's degree in July 2012 from the Karlsruhe Institute of Technology (KIT) and defended his PhD dissertation, entitled “Multilingual Modulation by Neural Language Codes”, at KIT in June 2018. His research interests are deep neural networks and automatic speech recognition. He investigates the inner workings of neural networks and how they can be stimulated to learn certain features. In speech recognition, he focuses on low-resource conditions and multilingual scenarios.

Series: Microsoft Research Talks