
Sunayana Sitaram

Senior Researcher


Hello, and thanks for stopping by! I’m a Senior Researcher at Microsoft Research India, where I work on Speech Processing for Multilingual Communities. My current focus is on techniques for building NLP and speech processing systems that can handle code-switching without needing a large amount of code-switched training data. I also work on speech technologies for low-resource languages. For up-to-date information about publications, please take a look at my Google Scholar page.

At MSRI, I am part of Project Mélange, in which we look at various aspects of code-switching and mixing, including how and why multilinguals code-switch. I have also been an organizer of the Interspeech Special Session on Code-switching from 2017 to 2019, which we will organize as a workshop co-located with Interspeech 2020.

My research goal is to make all the content in the world available to all the people in the world, regardless of the language they speak, their level of education, their age, their gender, or their special needs. So far, my main expertise has been in multilingual systems, particularly in dealing with languages that have very few linguistic resources.

*NEW* Datasets

Code-switched data for the Language Identification shared task organized as part of the First Workshop on Speech Technologies for Code-switching for Multilingual Communities is now available for research use.

I also organized a shared task on ASR for low-resource languages in a special session at Interspeech 2018. The data we released for this challenge, covering three low-resource Indian languages, is now available for research use.

Prior to coming to MSR India

I finished my PhD in December 2016 at the Language Technologies Institute, Carnegie Mellon University. I worked on Text-to-Speech systems with my advisor Alan W Black, and my thesis was on pronunciation modeling for low-resource languages. From 2010 to 2012, I was a Master's student at CMU with Jack Mostow, working on children’s oral reading prosody. I also interned with Microsoft Research India in summer 2012, where we built a low-vocabulary ASR system for farmers in rural central India.