Multilingual Systems

The Multilingual Systems Group explores software technologies that enable seamless content creation, storage, search, access, and interaction in multiple languages.

At the Multilingual Systems (MLS) group in Microsoft Research India, we believe that Multilingual Information Access is critical for the acquisition, dissemination, exchange, and understanding of knowledge in the global information society. The accelerated growth in the size, content, and reach of the Internet, the diversity of user demographics, and the skew in the availability of information across languages all point to the increasingly critical need for Multilingual Information Access. We are an interdisciplinary research group focusing on technologies for Multilingual Information Access, such as Cross-language Information Retrieval, Multilingual Information Extraction, and Machine Translation, that transparently bridge the gap between available information and user needs across languages.

To this end, we carry out cutting-edge research on a) several aspects of Cross-language Information Retrieval and Multilingual Information Extraction, including query expansion, domain adaptation, automatic alignment of multilingual corpora, multilingual named entity extraction, and machine transliteration; b) automated and collaborative creation of parallel corpora for Machine Translation; and c) fundamental properties of languages and language phenomena, including language acquisition and evolution, structural properties of corpora in the framework of complex networks, and the interaction between syntax and prosody.

In addition, we are interested in robust fundamentals, especially annotation standards, data collection efforts, and basic tools for research in Indian languages. More importantly, we would like to enable and be a part of a strong research ecosystem in Multilingual Information Access in India.


Kalika Bali

Principal Researcher