Customized, Specialized Translation Now a Reality
I can’t read Japanese. I know it when I see it, but what I see is merely a succession of word symbols, indecipherable to my untrained eye. No matter, though, because these days, the Microsoft Translator service enables quick translations from Japanese to English, as easy as copy, paste, and click. I might not be able to read Japanese, but I have a tool at my disposal that enables me to understand documents written in that language.
I can’t read the Nepali language either. But that language is not used by nearly as many people as Japanese, and, therefore, it doesn’t rank high on the list of languages to be added to translation tools. When it comes to understanding Nepali documents, I’m out of luck.
That, however, could change with the commercial availability of the Microsoft Translator Hub, announced in Toronto on July 11 during the Microsoft Worldwide Partner Conference. Now, businesses and communities have the ability to build, train, and deploy customized or built-from-scratch automatic language-translation systems. Those systems could be used to translate languages such as Nepali, or they could be applied to specialized domains with unique, specific terminology, such as health care, the legal profession, or technology.
Microsoft Translator Hub, which features significant contributions from Microsoft Research, improves the translation quality for domain-specific terminology and style. It provides a translation portal that works with the Microsoft Translator web service—and provides access to Microsoft’s big-data back end, enabling businesses and communities to construct custom systems to deliver the most fluent translations possible with today’s technology.
“The Hub changes the conversation around translation quality within a commercial setting,” says Vikram Dendi, director of Product Strategy and Marketing for the Machine Translation group at Microsoft Research Redmond, “by putting in the hands of businesses powerful tools that allow them to significantly influence how their content can be translated by the Microsoft Translator service.”
For more on the Microsoft Translator Hub announcement, read Dendi’s blog post.
Powered by Windows Azure, Microsoft Translator Hub learns terminology and style by analyzing previously translated documents and supplementing that knowledge with feedback and corrections. The result extends translation quality far beyond that achieved by generic translation services—and, thus, provides a dramatic increase in the number of scenarios in which automatic translation is both acceptable and useful.
While such a service can be used to enable translation for the world’s many languages not yet supported by major translation providers, it also would appeal to a business interested in customizing Microsoft’s big-data-powered language systems for its own type of data or domain.
Microsoft Research has provided years of world-class expertise to help foster Microsoft Translator Hub. Not least among those contributions are its highly scalable, incremental training capabilities, which enable the building of a customized machine-translation system in hours, or even minutes. Anthony Aue, Chris Quirk, and Will Lewis of the Machine Translation team have reduced and compressed translation and target-language models to enable Microsoft Translator Hub to run efficiently within Bing data-center hardware.
In addition, Microsoft Research Connections has conducted workshops using Microsoft Translator Hub with the Hmong community in Fresno, Calif., and with Nepalese students at Kathmandu University in Nepal.
One of the key research contributions to Microsoft Translator Hub, though, came in response to crisis.
In the wake of the January 2010 earthquake in Haiti, Microsoft volunteers providing assistance asked the Machine Translation team to deliver an English/Haitian Creole translator. That was accomplished, remarkably, within five days, and a few valuable lessons were learned along the way.
“During the Haitian Creole effort,” Dendi recalls, “while there were many passionate community members, the initiative had to be taken by our team to start building the system, given that the training tools were neither external nor simple. While it was quite a collaborative effort, with several external partners providing training data, there was a significant time and resource investment from the Machine Translation team to move the project forward in terms of aligning, cleaning, and evaluating the translation system.”
That experience proved invaluable, and the Hmong people were among the beneficiaries. Using an early beta version of the Hub, that community received unprecedented access to, and control over, the process of building a customized translation system for the Hmong language.
“They initiated the project, uploaded and cleaned data, trained multiple systems, collaborated to create more data, and improved it continually,” Dendi says. “There are clear parallels to this on the enterprise side, as well.”
Now, with the commercial availability of Microsoft Translator Hub, those parallels will be explored to the utmost, as companies, organizations, and enterprises gain the ability to harness better, more specialized translation quality. I’ll probably never learn to read Nepali, but if I find myself confronted with a document written in that language, I just might have somewhere to turn.