More Research Contributions to Windows Phone Translation

Published May 14, 2012


Posted by Rob Knies

Recently, I wrote about Microsoft Research contributions to the new Translator App for Windows Phone, available for free download on Windows Phone Marketplace.

Some of them, anyway.

As it turns out, the story goes a bit deeper than the earlier post was able to convey. Since its publication, I have connected with a couple of researchers from Microsoft Research Asia who told me of other ways, some of long standing, some new, that research has enabled and extended the Translator App.

First, the foundational part: The Natural Language Computing (NLC) Group, managed by Ming Zhou, senior researcher at the Asia lab.

“The app,” Zhou says, “is powered by the same state-of-the-art technology used in Bing Translator. Our group has seamlessly worked together with the Translator group for many years on translation engines for language pairs of Chinese-English and Japanese-English.”

Indeed, the Chinese-English machine-translation engine in Bing Translator was a joint effort from Microsoft Research Asia’s NLC group and its Innovation Engineering group, managed by Jonathan Tien, in cross-lab collaboration with the Machine Translation Incubation team and the Natural Language Processing group at Microsoft Research Redmond.

And that’s not all.

“We contributed a large-scale web-mining tool kit and the mined parallel data in the general and travel domains on English-Chinese language pairs,” Zhou says. “Those have been used with other data sources for the training of English-Chinese machine translation in Translator.”

The Asia lab’s contributions to the Translator App’s Chinese optical-character-recognition (OCR) engine are a bit more recent, as explained by Qiang Huo, research manager of the Multimodal Interaction Technology team within the Speech Group at the Asia lab.

“We developed a solution for a compact, accurate, and efficient Chinese-character classifier,” he says, “and shared the relevant knowledge and code with the OCR product team. And we developed the relevant training tools and shared them with the product team to develop the final, client-side Chinese OCR engine shipped in the Translator App.

“We also pioneered and demonstrated the coolness and the feasibility of an OCR translation app for Windows Phone.”

The client-side OCR engine, in particular, elicited raised eyebrows in a Translator App demo illustrating its use in translating a menu written in Chinese for a hungry customer unable to read Chinese.

That work began in July 2010 with a project called Snap and Translate Using Windows Phone, which became a demo shown in March 2011 during TechFest, Microsoft Research’s annual technology showcase. The basic idea is to enable a user to use a Windows Phone to snap an image of a block of text, highlight a portion of the text with a tap or a swipe, and get an instant translation. Colleagues such as Jun Du, Lei Sun, Matt Scott, Gang Chen, and Jian Sun were instrumental in developing this demo, as was the OCR product team.

The TechFest demo proved a hit with the right audience.

“Our partners on the Bing Mobile Augmented Reality team decided to develop an OCR translation app for Windows Phone,” Huo says, “and the source code for our demo was transferred to the Serbia-based OCR product team for their reference.”

The transfer later was extended to support both text and speech input. Then, in June 2011, the OCR product team asked Huo to develop a new, client-side OCR solution for Simplified Chinese. He quickly designed and developed, with colleagues Du and Kai Chen, a practical solution including the use of a multiprototype-based classifier for efficiency, a new, discriminative-training approach for achieving high recognition accuracy, a model compression technique to get an appropriately small footprint, and a fast-match tree to reduce latency.
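For readers curious what two of those ingredients look like in miniature, here is a rough Python sketch of a multiprototype classifier paired with a one-level fast-match tree. It is an illustration under stated assumptions, not the team’s engine: the function names `build_fast_match_tree` and `classify`, the plain k-means grouping, and the toy two-dimensional data are all invented here, and the sketch leaves out discriminative training and model compression entirely.

```python
import math


def build_fast_match_tree(prototypes, n_groups, n_iters=10):
    """Group prototypes under coarse centroids (a one-level tree) with plain k-means."""
    # Hypothetical helper: seed the centroids with the first n_groups prototypes.
    centroids = [list(p) for p in prototypes[:n_groups]]
    assign = [0] * len(prototypes)
    for _ in range(n_iters):
        # Assign each prototype to its nearest centroid.
        assign = [min(range(n_groups), key=lambda g: math.dist(p, centroids[g]))
                  for p in prototypes]
        # Move each centroid to the mean of its members.
        for g in range(n_groups):
            members = [p for p, a in zip(prototypes, assign) if a == g]
            if members:  # keep the old centroid if a group is empty
                centroids[g] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assign


def classify(x, prototypes, labels, centroids, assign, n_probe=1):
    """Label x by its nearest prototype, searching only the n_probe closest groups."""
    order = sorted(range(len(centroids)), key=lambda g: math.dist(x, centroids[g]))
    probe = set(order[:n_probe])
    candidates = [i for i, a in enumerate(assign) if a in probe]
    if not candidates:  # fall back to an exhaustive search
        candidates = list(range(len(prototypes)))
    best = min(candidates, key=lambda i: math.dist(x, prototypes[i]))
    return labels[best]


# Toy data: each class has two prototypes, standing in for two variants of a character.
prototypes = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
labels = ["A", "A", "B", "B"]
centroids, assign = build_fast_match_tree(prototypes, n_groups=2)
print(classify((9.0, 1.0), prototypes, labels, centroids, assign))
print(classify((1.0, 9.0), prototypes, labels, centroids, assign))
```

The point of the coarse first stage is that each query is compared in full against only the prototypes in the nearest group or groups rather than against every prototype, which is where the latency saving comes from. Raising `n_probe` trades some of that speed for a lower chance of missing the true nearest prototype.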

Huo is delighted with what they were able to achieve by working with the OCR product team.

“The importance of the Chinese market leads to a strong demand for a Chinese OCR-input capability in the Translator App,” he says. “Our contributions make it possible to do client-side Chinese OCR, which will delight our users when they do not have a network connection yet still will be able to use OCR translation.”
