Translator Community Partners
The Centre for Global Intelligent Content (CNGL) is funded by the Irish Government to revolutionise the way people interact with digital content, systems and each other to achieve new levels of access, efficiency and empowerment. The CNGL, led by Prof. Vincent Wade at Trinity College Dublin, is focused on the creation, processing, unification and seamless integration of multilingual, multi-modal and multimedia content. The CNGL Machine Translation (MT) group at Dublin City University is led by Prof. Andy Way and Prof. Qun Liu. Prof. Way is Deputy Director of the CNGL, and Prof. Liu leads research in translation and localisation.
The Hmong language is spoken by over 6.5 million people around the world, mainly in Southeast Asia and China, with communities in the United States, Australia, and France. Hmong has a very long oral tradition, but the writing system was developed just 60 years ago. The greater Hmong community has the goal of long-term preservation of the language, and sees machine translation as a means to foster its enduring growth. The following Hmong Language Partners participated in making the Hmong translator a reality:
- Stone Soup
- Fresno Unified School District
- 3Hmong Publishing
- Jay Xiong and Hmongdictionary.com
- California State University Fresno College of Arts and Humanities
- Joe Fries
- The Hmong community
The Otomi Translator is a project of great importance not only for the State of Queretaro, but throughout Mexico and the world. For the first time, this endangered native language will be available in a machine translation system. The Otomi Translator is an important contribution to help rescue and revitalize the language, culture, and identity of the Otomi people. Native and non-native speakers who are interested in communicating with the Otomi people may use the Microsoft Otomi Translator. It will also help the Otomi people learn how to write in their native language. This technology will encourage intercultural communication and allow other cultures to experience the Otomi people, their worldview, and their culture.
Urdu is one of the 22 scheduled languages of India, and the national lingua franca in Pakistan. Together, Hindi and Urdu speakers constitute the fourth largest linguistic community in the world. Dr. Girish Nath Jha, Professor of Computational Linguistics and Dean at the School of Sanskrit and Indic Studies, JNU, New Delhi, has been helping Microsoft since 2006. With help from his research students, staff and the other language centers, he set up an Urdu enthusiasts group at JNU which focuses on corpora collection for training MT systems and building basic tools for Urdu. With Microsoft Research, the group organized a workshop to sensitize the community to the need for creating resources for the language. The group has been helping Microsoft Translator to develop the English-Urdu translation system.
Swahili is a major lingua franca spoken in East Africa, and is the language of business and commerce for more than 150 million people. It is a Bantu language which has its origins among the Swahili people of Tanzania, Kenya and Mozambique. However, the language now has official status in Tanzania, Kenya, Uganda, the Democratic Republic of the Congo, Comoros, Zanzibar and the African Union. Translators without Borders Kenya (TWBK), a field office of the US-based Translators without Borders, worked with Microsoft on the Swahili Translator as a means to professionalize and standardize Swahili translations, especially in the areas of health and crisis relief, as part of the Words of Relief crisis relief translation network. TWB’s mission is to increase access to vital knowledge using language and translation. Learn more about TWB’s Words of Relief program and digital exchange application for connecting aid workers to TWB’s rapid response translator by following this link.
Established in 1991, Tilde is a leading Baltic IT company specializing in localization, multilingual and Internet software, and a leader in innovation for Baltic languages in state-of-the-art language technologies. Tilde’s aim is to provide language technologies for the languages of the Baltic countries, specifically Latvian, Lithuanian, and Estonian, that are equivalent to the support available for the major languages of the world. Tilde is a privately owned company with offices in Riga, Tallinn and Vilnius.
The goal of the Maya Translator is to document and preserve the cultural and linguistic heritage of the Maya people for future generations. It uses state-of-the-art technology in order to connect Maya communities with other cultures worldwide. The first of its kind in Mexico, it pioneers online translation for Maya. The Microsoft Translator Hub was used by the Maya communities to develop their own language translation models, while empowering and connecting them with the world. The Maya translation systems are still in an early stage of development and those individuals who would like to contribute their efforts are encouraged to join the team. If you are a language policy maker, educator, language teacher, or a community member who is interested in helping to preserve the beautiful Maya language for future generations, please visit our site to learn more.
Filipino, Fijian, Malagasy, Samoan, Tahitian and Tongan are the six Austronesian languages developed in partnership with Appen, LDS and Microsoft for the Microsoft Translator text API. Austronesian languages are spoken across a broad geography, from Madagascar to the Polynesian Islands, to New Zealand and to South China, with more than 119 million Austronesian speakers across the globe.
Learn more about the Church of Jesus Christ of Latter-day Saints, their 15 million members in 176 countries, and their publications in 188 languages. Machine translation has become an integral part of supporting the Church’s work.
Māori are the indigenous Polynesian people of New Zealand. Their language, spoken for over 1000 years, has been officially recognised in New Zealand since 1987, alongside English and New Zealand Sign Language. Interestingly, te reo Māori is one of the first indigenous languages worldwide to be modelled with Microsoft’s innovative Neural Machine Translation (NMT) techniques, which can be more accurate than statistical translation models. Fifteen percent of the New Zealand population is Māori, yet only a quarter of people who identify as Māori speak te reo Māori, and only three percent of all people living in New Zealand speak it. New Zealand is on a journey, a cultural renaissance where te reo Māori is finding confidence in a modern technological context.
Through Microsoft’s AI for Cultural Heritage program, Waikato University and Auckland University of Technology are key organisations helping to revitalise our indigenous language and the legacy we treasure. The creation of the translated corpus contributes towards the globally accessible Microsoft Translator for te reo Māori. Our work is integral to both the preservation of historical data, and the continued use of the language across the world by encouraging the translation of te reo Māori into many languages, spreading and amplifying New Zealand culture. In early 2019, the New Zealand government pledged to ensure 1 million people can speak basic te reo Māori by 2040. At the time of the announcement, New Zealand’s population was 4.9 million.
Inuktitut is the primary dialect of the Inuktut language; it is spoken by approximately 40,000 Inuit across Inuit Nunangat, the Inuit homeland in Northern Canada, and is used by 70 percent of Nunavut’s residents. Inuinnaqtun, also a dialect of Inuktut, is on UNESCO’s list of endangered languages. Inuinnaqtun is the mother tongue of fewer than 600 people, concentrated mostly in the Kugluktuk and Cambridge Bay communities in the Kitikmeot region of Nunavut.
The Upper Sorbian language community donated data which was instrumental in adding the Upper Sorbian language to Translator. Upper Sorbian is a Slavic language spoken by about 25,000 people in the East German region known as Lusatia, where it is recognized as an official second language. The Upper Sorbian language is also recognized as a protected minority language of Germany.
Klingon is a constructed language invented for use within the Star Trek universe, with a large fan following around the world. The Klingon Language Institute (KLI), Paramount Pictures, and CBS Consumer Products were immensely helpful in the addition of Klingon as a supported language for machine translation.
Klingon is a trademark of CBS Studios Inc.
In operation since 1992, the Klingon Language Institute is a nonprofit 501(c)(3) corporation that exists to facilitate the scholarly exploration of the Klingon language and culture. It continues its mission of bringing together individuals interested in the study of Klingon linguistics and culture, and providing a forum for discussion and the exchange of ideas.
The linguistic expertise of members of the Klingon Language Institute, Professor Marc Okrand (the inventor of the Klingon language), and Dr. Lawrence Schoen was instrumental in the addition of Klingon. Using the Translator Hub, members of the community are able to review, critique and correct translation errors and retrain the engine for continued improvement.
Paramount Pictures Corporation is a film and television production/distribution studio, consistently ranked as one of the largest (top-grossing) movie studios. It is a unit of U.S. mass media company Viacom. Paramount Pictures is a member of the Motion Picture Association of America (MPAA).
CBS Consumer Products manages worldwide licensing and merchandising for a diverse slate of television brands and series from CBS, CBS Television Studios and CBS Television Distribution, as well as from the company’s extensive library of titles, Showtime and CBS Films. Additionally, the group oversees online sales of programming merchandise. For more information, visit: www.cbsconsumerproducts.com.