Microsoft Research Blog

More Research Contributions to Windows Phone Translation

May 14, 2012 | By Microsoft blog editor

Posted by Rob Knies

Recently, I wrote about Microsoft Research contributions to the new Translator App for Windows Phone, available for free download on Windows Phone Marketplace.

Some of them, anyway.

As it turns out, the story goes a bit deeper than the earlier post was able to convey. Since its publication, I have connected with a couple of researchers from Microsoft Research Asia who told me of other ways, some of long standing and some new, in which research has enabled and extended the Translator App.

First, the foundational part: The Natural Language Computing (NLC) Group, managed by Ming Zhou, senior researcher at the Asia lab.

“The app,” Zhou says, “is powered by the same state-of-the-art technology used in Bing Translator. Our group has seamlessly worked together with the Translator group for many years on translation engines for language pairs of Chinese-English and Japanese-English.”

Indeed, the Chinese-English machine-translation engine in Bing Translator was a joint effort from Microsoft Research Asia’s NLC group and its Innovation Engineering group, managed by Jonathan Tien, in cross-lab collaboration with the Machine Translation Incubation team and the Natural Language Processing group at Microsoft Research Redmond.

And that’s not all.

“We contributed a large-scale web-mining tool kit and the mined parallel data in the general and travel domains on English-Chinese language pairs,” Zhou says. “Those have been used with other data sources for the training of English-Chinese machine translation in Translator.”

The Asia lab’s contributions to the Translator App’s Chinese optical-character-recognition (OCR) engine are a bit more recent, as explained by Qiang Huo, research manager of the Multimodal Interaction Technology team within the Speech Group at the Asia lab.

“We developed a solution for a compact, accurate, and efficient Chinese-character classifier,” he says, “and shared the relevant knowledge and code with the OCR product team. And we developed the relevant training tools and shared them with the product team to develop the final, client-side Chinese OCR engine shipped in the Translator App.

“We also pioneered and demonstrated the coolness and the feasibility of an OCR translation app for Windows Phone.”

The client-side OCR engine, in particular, elicited raised eyebrows in a Translator App demo illustrating its use in translating a menu written in Chinese for a hungry customer unable to read Chinese.

That work began in July 2010 with a project called Snap and Translate Using Windows Phone, which became a demo shown in March 2011 during TechFest, Microsoft Research’s annual technology showcase. The basic idea is to enable a user to use a Windows Phone to snap an image of a block of text, highlight a portion of the text with a tap or a swipe, and get an instant translation. Colleagues such as Jun Du, Lei Sun, Matt Scott, Gang Chen, and Jian Sun were instrumental in developing this demo, as was the OCR product team.

The TechFest demo proved a hit with the right audience.

“Our partners on the Bing Mobile Augmented Reality team decided to develop an OCR translation app for Windows Phone,” Huo says, “and the source code for our demo was transferred to the Serbia-based OCR product team for their reference.”

The transfer later was extended to support both text and speech input. Then, in June 2011, the OCR product team asked Huo to develop a new, client-side OCR solution for Simplified Chinese. He quickly designed and developed, with colleagues Du and Kai Chen, a practical solution including the use of a multiprototype-based classifier for efficiency, a new, discriminative-training approach for achieving high recognition accuracy, a model compression technique to get an appropriately small footprint, and a fast-match tree to reduce latency.
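For readers curious about what a multiprototype-based classifier looks like in principle, here is a minimal Python sketch. It is not the team's engine: the feature vectors, the labels, and the names `classify`, `squared_distance`, and `TOY_PROTOTYPES` are all made up for illustration, and the discriminative training, model compression, and fast-match tree mentioned above are not shown. The idea it illustrates is only that each character class keeps several prototype vectors, and an input is assigned to the class that owns the nearest one.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(features: Vector, prototypes: Dict[str, List[Vector]]) -> str:
    """Return the label whose nearest prototype is closest to `features`."""
    best_label = None
    best_distance = float("inf")
    for label, class_prototypes in prototypes.items():
        # A class is scored by its best-matching prototype, so one character
        # can be represented by several distinct shape variants.
        distance = min(squared_distance(features, p) for p in class_prototypes)
        if distance < best_distance:
            best_label, best_distance = label, distance
    if best_label is None:
        raise ValueError("no prototypes supplied")
    return best_label


# Hypothetical toy data: two classes, two prototypes each, in a made-up
# 2-D feature space. Real features would come from the character image.
TOY_PROTOTYPES: Dict[str, List[Vector]] = {
    "ren": [(0.0, 0.0), (1.0, 0.0)],
    "da": [(5.0, 5.0), (6.0, 4.0)],
}

print(classify((0.9, 0.2), TOY_PROTOTYPES))
```

A brute-force scan like this compares the input against every prototype of every class, which is presumably why a production engine for thousands of Chinese characters would want something like the fast-match step described above to narrow the candidates first.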

Huo is delighted with what they were able to achieve by working with the OCR product team.

“The importance of the Chinese market leads to a strong demand for a Chinese OCR-input capability in the Translator App,” he says. “Our contributions make it possible to do client-side Chinese OCR, which will delight our users when they do not have a network connection yet still will be able to use OCR translation.”