Multilingual Systems

Established: November 30, 2004

The Multilingual Systems Group explores software technologies that enable seamless content creation, storage, search, access, and interaction across multiple languages.

Research Overview

At the Multilingual Systems (MLS) group in Microsoft Research India, we believe that Multilingual Information Access is critical for the acquisition, dissemination, exchange, and understanding of knowledge in the global information society. The accelerated growth in the size, content, and reach of the Internet, the diversity of user demographics, and the skew in the availability of information across languages all point to the increasingly critical need for Multilingual Information Access. We are an interdisciplinary research group focusing on technologies for Multilingual Information Access, such as Cross-language Information Retrieval, Multilingual Information Extraction, and Machine Translation, that transparently bridge the gap between available information and user needs across languages.

To this end, we carry out research on a) several aspects of Cross-language Information Retrieval and Multilingual Information Extraction, including query expansion, domain adaptation, automatic alignment of multilingual corpora, multilingual named entity extraction, and machine transliteration; b) automated and collaborative creation of parallel corpora for Machine Translation; and c) fundamental properties of languages and language phenomena, including language acquisition and evolution, structural properties of corpora in the framework of complex networks, and the interaction between syntax and prosody.

In addition, we are interested in robust fundamentals, especially annotation standards, data collection efforts, and basic tools for research in Indian languages. More importantly, we would like to enable and be a part of a strong research ecosystem in Multilingual Information Access in India.

Microsoft Research blog

Enhancing Multilingual Content in Wikipedia

By Douglas Gantenbein, Senior Writer, Microsoft News Center

Wikipedia has become one of the world’s largest and perhaps most powerful information repositories. But it is heavily English-centric. Making Wikipedia more multilingual inspired a Microsoft Research India team to develop a tool called WikiBhasha, which was launched Oct. 18. WikiBhasha (“Wiki,” signifying its community-oriented approach; “Bhasha,” a Sanskrit word meaning “language”) features a content-creation platform that combines linguistic services, such as machine translation, with a Wikipedia-friendly content editor.…

October 2010
