By Gary Alt, Writer, Microsoft
Imagine mining the web to learn a language. No, not the jargon of webspeak, where IMHO means “in my humble opinion” or F2F is “face to face,” but real, spoken languages, such as Spanish, Hindi, or Japanese. That’s the notion that intrigued Ming Zhou, Matt Scott, and their colleagues at Microsoft Research Asia as they studied how the web’s zillions of words, in scores of languages, could be utilized for exploring and learning new tongues.
The resulting application, named Engkoo (which loosely translates to “English vault” in Mandarin Chinese), is a groundbreaking piece of software that takes advantage of natural-language-processing and speech technologies to build massive sets of bilingual terms and sentences. Engkoo currently helps Chinese speakers who are learning English, but the technology could be applied to any two languages that are widespread on the web.
“Engkoo is a new kind of language-assistance technology for Chinese people, to enable them to ultimately master English as a native speaker might,” says Scott, development lead of the Innovation Engineering group and Engkoo project manager. “It unifies human translation mined from the web, machine translation, and a language-learning experience into one user-friendly search-and-explore interface.
“By continuously discovering and analyzing high-quality translations on the Internet, Engkoo can be used to close the ever-expanding translation gap between English and Chinese. The technology itself is language independent and can be extended to other language pairs in the future.”
Adds Zhou, senior researcher and research manager of the Natural Language Computing group:
“Engkoo aims to improve the quality of English learning and translation in China. New words are added to both Chinese and English every day, while other words change in meaning and usage. Traditional translation dictionaries can’t keep up and don’t always provide results that reflect current common usage.
“Additionally, Engkoo addresses other challenges, such as the difficulty of finding fluent sources of English learning material in schools, the necessity of assisting information workers who increasingly need to communicate globally, and combating the proliferation of the ‘Chinglish’ found on many public signs and billboards around China.” The latter are poor Chinese-to-English translations that sometimes amuse, but more often confuse, English-speaking visitors.
“Engkoo is not designed primarily for learning a new language from the ground up,” Zhou says. “Rather, it’s designed to be an asset to those who already use or study English, such as English-as-a-second-language (ESL) students.”
So how exactly does Engkoo work? Zhou explains.
“Engkoo is the synthesis of multiple research technologies,” he says. “The primary technology is that of mining human translation knowledge from the web. This helps power other features. The mining works by scanning the web to find parallel Chinese and English content from separate web pages or within the same page, such as from news websites that may publish an article in both Chinese and English. Using multiple adaptive-pattern techniques, the system can extract out language-aligned sentences and words, thus powering Engkoo’s sample-sentence and term-definition features. The system uses multiple statistical methods to filter out noisy data and rank the sentences and definitions—similar, in some ways, to how a search engine works.”
The result is an enormous lexicon of bilingual terms and sentence paradigms. Engkoo then layers in information from existing dictionaries and reference sources, and, voilà, you have the world’s largest lexicon linking Chinese and English.
Engkoo’s ability to mine the vastness of the web provides it with powerful capabilities, going far beyond crude translations. It analyzes the parallel Chinese and English websites and then ranks the quality of the translation, thus building an ever-expanding repository of terms and sentences, ranked by the reliability and elegance of the translation.
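As a rough illustration of the filter-and-rank idea described above, the Python sketch below scores candidate sentence pairs by how many English words a small seed dictionary can link to the Chinese side, drops pairs that score too low, and sorts the rest best-first. The seed dictionary, the scoring function, the threshold, and every name here are assumptions made for this example; the article does not say which statistical methods Engkoo actually uses.

```python
# Hypothetical seed dictionary: English word -> set of Chinese translations.
SEED = {
    "i": {"我"},
    "like": {"喜欢"},
    "tea": {"茶"},
    "book": {"书"},
}

def overlap_score(english, chinese):
    """Fraction of English words whose seed translation appears in the Chinese text."""
    words = english.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if any(t in chinese for t in SEED.get(w, ())))
    return hits / len(words)

def rank_pairs(pairs, threshold=0.5):
    """Drop noisy pairs that score below the threshold, then sort best-first."""
    scored = [(overlap_score(en, zh), en, zh) for en, zh in pairs]
    kept = [item for item in scored if item[0] >= threshold]
    return sorted(kept, key=lambda item: item[0], reverse=True)

candidates = [
    ("I like tea", "我喜欢茶"),
    ("I like this book", "我喜欢这本书"),
    ("Click here to subscribe", "我喜欢茶"),  # misaligned page residue
]
ranked = rank_pairs(candidates)
```

In this toy run the misaligned third pair is filtered out, and the two genuine translations are kept with the fully covered pair ranked first.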
When a user enters words or sentences into the Engkoo search box, the software combs through its ranked data set to find the best translation. This works in both directions; the search terms can be in Chinese or English. What’s more, Engkoo provides sample sentences that show how the translated words and phrases are used, thus helping the learner grasp the nuances of the foreign language.
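A two-way lookup of this kind might be sketched in Python as follows. The entries, the quality scores, and the function names are hypothetical, and detecting Chinese by checking for characters in the CJK Unified Ideographs range is a simplification chosen for the example, not a description of Engkoo's implementation.

```python
# Hypothetical ranked entries: (english, chinese, quality score).
ENTRIES = [
    ("dictionary", "字典", 0.90),
    ("dictionary", "词典", 0.95),
    ("vault", "宝库", 0.80),
]

def is_chinese(text):
    """True if the text contains a character from the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)

def translate(query):
    """Return translations of the query, best-ranked first, in either direction."""
    # Pick which column to search and which to return, based on the query's script.
    src, dst = (1, 0) if is_chinese(query) else (0, 1)
    hits = [entry for entry in ENTRIES if entry[src] == query.lower()]
    hits.sort(key=lambda entry: entry[2], reverse=True)
    return [entry[dst] for entry in hits]

to_chinese = translate("dictionary")
to_english = translate("宝库")
```

The same function serves both directions because the only thing that changes is which side of each entry is matched and which is returned.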
How is this different from the many online translation applications already on the web? That’s like asking how calculus is different from arithmetic.
“Engkoo is different because it seamlessly unifies dictionary, machine translation, and language learning into an easy-to-use interface,” Scott says, “providing the user with more context, fresher results, more robust ways to avoid near-miss queries, and new ways to explore language. It excels relative to other services that specialize in a dictionary, machine translation, and language learning.”
Engkoo’s lexicon is massive, comprising more than 10 million terms and sample sentences, at least twice the estimated lexicon size of the largest competitor in the Chinese market. But it’s the source of this lexicon that really sets Engkoo apart. Because it uses novel web-mining technology to extract high-quality human translation knowledge from the web, it’s essentially creating a dynamic dictionary from translated news transcriptions and other Internet content. What makes this useful is that it’s “real” English: relevant and endlessly expanding.
Achieving state-of-the-art quality in machine translation (MT) required a collaborative effort across Microsoft groups and partners in academia. The key Microsoft partnership was between the Natural Language Processing group at Microsoft Research Redmond, led by Bill Dolan, and the Natural Language Computing group. The groups have worked together on many fruitful MT collaborations. For instance, in 2008, the groups, along with other research partners, received the top Chinese-English MT quality ranking in the National Institute of Standards and Technology’s prestigious Open MT evaluation series. Building on those research results, which were incorporated into Engkoo, the groups also worked together on other initiatives, including the Chinese-English MT engine now used in Bing Translator.
But it’s as a language-learning service that Engkoo shines, exposing novel, useful learning features not found in any competing product. For example, studies have shown that English learners in China find it difficult to compare two similar English words, such as “taught” and “instructed.” Engkoo addressed this challenge with an innovation that harnesses research from the area of human-computer interaction. The resulting comparison tool enables users to search for a word and then, within a tabbed window environment, search for similar words. Each word appears as its own tab, which can be dragged and dropped for side-by-side comparisons. That way, users can compare sets of similar words, complete with definitions and sample sentences. This comparison tool has been hailed in the Chinese press and online community.
Engkoo also provides the ability to explore example sentences by categorizing them by difficulty or domain. Users can learn at their own rate by selecting easy, medium, or difficult English, and they can choose English from domains such as written, oral, or technical. These classifications were performed through a novel machine-learning technique and applied on a massive scale.
Relative to other language-learning tools, Engkoo offers a unique, phonetic-based “fuzzy” search adapted to local pronunciation habits of mainland Chinese. By studying users, the Engkoo team discovered that Chinese ESL learners often search for words as they sound, such as those they heard from foreign colleagues or from music or television. So, for example, a Chinese user might search for “shampin,” which mainland Chinese speakers commonly say when their intent is “champagne.” Such behavior reveals a major limitation of other language-learning services, because many of those words cannot be found, so the learning process stops abruptly. Engkoo, by contrast, maps such words, enabling the user not only to find them, but also to learn the correct English spelling.
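One simple way to approximate a sound-based search is to reduce every word to a crude phonetic key and match queries on that key instead of on spelling. The Python sketch below does this with a few hand-written respelling rules and the standard library's `difflib`; the rules, the key function, and the tiny lexicon are all assumptions for illustration, and Engkoo's real mapping is presumably learned from user data and far richer.

```python
import difflib

# Hypothetical respelling rules; deliberately crude and not valid for all English words.
RULES = [("ch", "sh"), ("gn", "n"), ("ph", "f")]
VOWELS = set("aeiou")

def sound_key(word):
    """Crude phonetic key: apply respelling rules, then drop non-initial vowels."""
    w = word.lower()
    for old, new in RULES:
        w = w.replace(old, new)
    return w[:1] + "".join(c for c in w[1:] if c not in VOWELS)

def fuzzy_lookup(query, lexicon):
    """Return the lexicon word whose sound key is closest to the query's, or None."""
    index = {sound_key(word): word for word in lexicon}
    matches = difflib.get_close_matches(
        sound_key(query), list(index), n=1, cutoff=0.6
    )
    return index[matches[0]] if matches else None

lexicon = ["champagne", "campaign", "shampoo", "teacher"]
result = fuzzy_lookup("shampin", lexicon)
```

Here “shampin” and “champagne” reduce to the same key, so the misspelled query resolves to the correctly spelled word, which can then be shown to the learner.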
The ability to observe the alignment of translated words or phrases in bilingual sample sentences is yet another groundbreaking tool in Engkoo. As the user mouses over either the Chinese sentence or the English translation, the corresponding words are highlighted in both. The alignment information not only clearly exposes the structural differences of translated sentence pairs, but also provides instant translations.
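The mouse-over behavior can be pictured as a lookup over word-alignment links. In the Python sketch below, the sentence pair, the alignment links, and the `highlight` function are hypothetical stand-ins; the article does not describe how Engkoo stores or computes its alignments.

```python
english = ["I", "like", "green", "tea"]
chinese = ["我", "喜欢", "绿茶"]

# Hypothetical alignment links as (english_index, chinese_index) pairs.
# "green" and "tea" both link to the single Chinese word for green tea.
links = [(0, 0), (1, 1), (2, 2), (3, 2)]

def highlight(index, side):
    """Return (english_tokens, chinese_tokens) to highlight for a hovered token.

    side is "en" when the hovered token is English, "zh" when it is Chinese.
    """
    if side == "en":
        zh_idx = {z for e, z in links if e == index}
    else:
        zh_idx = {index}
    # Pull in every English token linked to the highlighted Chinese tokens.
    en_idx = {e for e, z in links if z in zh_idx}
    return (
        [english[i] for i in sorted(en_idx)],
        [chinese[i] for i in sorted(zh_idx)],
    )

hover_green = highlight(2, "en")
hover_xihuan = highlight(1, "zh")
```

Hovering over “green” highlights both “green” and “tea” alongside the one Chinese word they share, which is the kind of structural difference the alignment view makes visible.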
Engkoo also lets users learn and explore native English by statistically finding nearby words, or “collocations,” which would be difficult, if not impossible, to discover on one’s own without reading an enormous amount of English text. This system works because of a novel technique of leveraging part-of-speech wild cards. For example, users can find prepositions that typically follow the word “terrific” by simply searching for “terrific prep.” In this example, they could find sentences such as “I think it looks terrific on you.” These sentences are statistically significant because they are derived from web-scale language knowledge.
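A query such as “terrific prep” can be thought of as a head word followed by a part-of-speech wild card. The Python sketch below counts which prepositions directly follow the head word in a tiny corpus. The corpus, the hand-written preposition list standing in for a real part-of-speech tagger, and the function name are assumptions for this example, and raw counts are used in place of whatever statistics Engkoo applies at web scale.

```python
from collections import Counter

# Hypothetical wild-card table: tag name -> words carrying that part of speech.
POS = {"prep": {"on", "at", "for", "in", "with"}}

CORPUS = [
    "I think it looks terrific on you",
    "She is terrific at chess",
    "That hat is terrific on him",
    "We had a terrific day",
]

def collocations(query, corpus=CORPUS):
    """Count words matching the wild card that directly follow the head word."""
    head, wildcard = query.lower().split()
    allowed = POS[wildcard]
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        # Walk adjacent token pairs and keep those matching "head + wild card".
        for current, following in zip(tokens, tokens[1:]):
            if current == head and following in allowed:
                counts[following] += 1
    return counts.most_common()

result = collocations("terrific prep")
```

On this toy corpus the query surfaces “on” as the most frequent preposition after “terrific,” followed by “at,” while “terrific day” is ignored because “day” does not match the wild card.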
Finally, there is the text-to-speech (TTS) feature in Engkoo that can convert input text into natural-sounding speech. This has proved one of users’ favorite features, Scott notes. The state-of-the-art TTS technology has been developed and refined continuously by researchers in the Speech Group at Microsoft Research Asia. The TTS used in Engkoo recently was rated as the best in intelligibility in the international TTS contest, Blizzard Challenge 2010, in both English and Chinese.
The text-to-speech technology is based on a sophisticated, statistically trained model that succinctly captures the phonetic and rhythmic characteristics of native English speech. The model is then capable of synthesizing the sound evolution, the ups and downs of intonation change, the stressed or unstressed points of any given sentence. Besides being phonetically accurate, the spoken rhythm of the synthesized sentences is close to that of a native English speaker. This is invaluable to ESL learners, because the spoken rhythm is extremely difficult for a non-native English speaker to produce.
Pronunciation and Intonation
“Engkoo’s speech synthesis has good pronunciation, and the intonation is not bad,” says Frank Soong, principal researcher and research manager of the Speech Group. “Our English text-to-speech actually speaks better English than most of the English teachers in China.”
The TTS user interface is designed to facilitate easy playback and downloading to a user’s MP3 player for later listening and practicing. There are more than a million MP3 downloads per month from the Engkoo website.
Like most research developments, Engkoo, Zhou explains, rests on a strong foundation of past work.
“The project began over a decade ago,” he says. “The impetus for this research is one of the quintessential quests of the fields of natural language processing and computational linguistics: for computers to effectively assist people in understanding and using a foreign language.”
The group’s first project was the English Writing Wizard, a program designed for ESL learners, which featured a manually compiled lexicon. This evolved into English Writing Assistant, a feature that shipped in Office 2003. The next step toward Engkoo took place during Microsoft Research TechFests in 2007 and 2008, with the Natural Language Computing group generating buzz around Lingo, a demo prototype of a language-learning tool that had evolved from the English Writing Assistant. That’s when the Innovation Engineering group took over.
Working with experts in the fields of speech processing, web-data management, and human-computer interaction, they refined Lingo to meet the needs of language learners, eventually launching engkoo.com in 2009. The team continued to develop Engkoo’s user scenarios and underlying technology in a process they call “deployment-driven research,” and by early 2010, it was ready to release the feature broadly, shipping it in China as a part of Bing, where it is now called Bing Cidian, “cidian” translating to “dictionary” in English. This wide release brought millions of new users and established Engkoo as a highly popular product in the Asian market. Engkoo recently was named a finalist for The Wall Street Journal’s 2010 Asian Innovation Awards.
Building on such success, the researchers plan to apply the Engkoo technology to new language pairs, such as Japanese and English. In the meantime, it’s great to have two of the world’s most widely spoken languages linked together in such a powerful learning tool.