Microsoft Research Blog

Combating the Chinglish Scourge

October 26, 2011 | By Microsoft blog editor

Posted by Rob Knies

Example of Chinglish

“Carefully hits to the forehead” instead of “Watch your head”? Well, I’m sure I’ve written things like that on occasion myself, and perhaps you have, too. Still, it’s not exactly the king’s English, is it?

In China, such usages are referred to as “Chinglish”: unfortunate turns of phrase created by translators whose eagerness to communicate across language barriers outstrips their ability to do so (even if their English remains infinitely more adept than my Mandarin). At best, it can be a source of merrymaking; at worst, an embarrassment. What can be done?

Matt Scott has a solution. Scott, senior development lead in the Innovation Engineering group at Microsoft Research Asia, is the project lead for Engkoo, a technology for exploring and learning language, and he has particular experience in trying to rid China of such aberrant coinages.

“As both the English and Chinese languages continue to rapidly expand and alter their vocabularies in the Internet age,” he says, “without fresh and authentic bilingual learning materials, the translation gap grows. A byproduct of the challenges Chinese speakers have in learning English can be seen in the rise of Chinglish, which is largely considered an erroneous interlanguage.”

He places part of the blame on stale Chinese-English dictionaries and low-quality English-language textbooks, but instead of railing against such failings, he is intent on correcting them. In fact, Engkoo, which now powers the Bing Dictionary service in China, was adopted by World Expo 2010 Shanghai organizers to help clean up Chinglish billboards across the city.

“Hundreds of Shanghai students snapped Chinglish pictures throughout the city,” Scott says, “and uploaded about a thousand examples to a collection website that offers Engkoo translations.”

The students were motivated to do so via prizes provided by MSN China. They also made use of social-networking features to vote on the most outrageous examples of Chinglish. That enabled the Engkoo team and users to prioritize the most egregious usages, and Bing Maps helped identify the location of the billboards so they could be updated.

The Engkoo service benefited, as well.

“Our editors, engineers, and researchers scour the site,” he says, “to learn the different kinds of Chinglish and improve our web mining and translation algorithms.”

The effort gained the attention of China Daily, the largest English-language newspaper in China, but that’s not unusual. Engkoo has received all sorts of press coverage in the Chinese media and beyond. A paper titled “Towards a Specialized Search Engine for Language Learners” appeared last month in Proceedings of the IEEE. Engkoo has been covered by The Wall Street Journal, Popular Science, Engadget, and PC World, and, in 2010, the project won The Wall Street Journal’s Asian Innovation Reader’s Choice Award.

In addition, China Daily has adopted Engkoo’s “hover translation” feature (hover your cursor over an English word or phrase and get an inline Chinese translation) on its China-facing website, and the People’s Daily, the largest official newspaper in China, uses the same feature on its English-language website.

Microsoft products, too, reap the benefits. Engkoo technology has been transferred into Bing, MSN, Office, and Windows Live Messenger. The Bing Dictionary for Chinese users has been transferred en masse to Microsoft’s Search Technology Center Asia, a partnership between Microsoft Research Asia and Bing.

“Web-based, computer-assisted language-learning approaches to English present a promising opportunity to reduce the growth of Chinglish by offering learners and teachers alike low-cost access to genuine, relevant English material,” Scott concludes. “By leveraging web data that are inherently contextual, fresh, and vast, a specially adapted search engine can make this knowledge easily accessible.”