I came to Microsoft Research in March 1998, first as a Researcher in the speech technology group working in the areas of spoken language understanding and dialog modeling. I contributed to the MiPad project and created the Speech Application Language Tags (SALT), which are now part of the international standards ISO/IEC 18051/ECMA-269/ETSI TS 102 173, ISO/IEC 18056/ECMA-323/ETSI TS 101 990, and ECMA-348/ISO/IEC 18450. An object model version, described in this TR I wrote, has entered its final phase of being standardized. I also contributed to the World Wide Web Consortium (W3C) Speech Recognition Grammar Specification (SRGS), the W3C Speech Synthesis Markup Language (SSML), and various other publications from the W3C Multimodal Interaction Working Group. Many of my research papers can still be found at the speech group's publication list and video demo area.

In January 2004, I moved to the speech product group and became a software architect. There I helped create and ship Microsoft Speech Server, which still powers the corporate call center for Microsoft. If you call Microsoft's main number, you will be greeted by my automated operator, MS Connect. In this capacity, I also managed the revision of the speech system used in Microsoft Voice Command, an add-on for Windows Mobile smartphones that allows users to operate their phones by voice in an eyes-busy, hands-busy environment. Many of these technologies are still in use in Cortana, a virtual personal assistant from Microsoft.

I was a founding member of an incubation group inside Microsoft that shipped Microsoft Response Point, a speech-enabled small business phone system that uses voice over Internet Protocol (VoIP) technologies. Because the incubation group was structured to run like a start-up inside Microsoft, I had the opportunity to be the acting development manager, and later the testing manager, and to build the engineering team from the ground up. In addition to the speech capabilities, I was also responsible for ensuring the product was easy to set up and easy to use, including the invention of the magic "Response Point button" that earns Microsoft revenue on every phone sold without even having Microsoft software on it! I am especially glad that these and other innovations of the product have received awards and customer feedback.

Since September 2007, I have been back in Microsoft Research (MSR), joining the newly founded Internet Service Research Center with a mission to revolutionize online services and make the Web more intelligent. I have been teaching the machine to read massive amounts of web content to extract knowledge, to understand users' interests and anticipate their needs, and to serve web knowledge to users and alert them to it in a helpful way, including by engaging in a natural conversation or multimodal dialog. The first application, which changes the way web search works in Bing, was first announced at the MSR Faculty Summit in July 2010. It is exhilarating to see that, since that public disclosure, major web search companies, such as Google (in 2012) and Baidu (in 2014), have also introduced similar services into their products. To ensure the research community can verify, replicate, and advance our results, components and data sets underlying my research work have been made available through Microsoft Cognitive Services, ranging from the web-scale Markov N-gram to the Knowledge Exploration Service. In March 2016, I took on an additional role as Managing Director of MSR Outreach, an organization with the mission to serve the research community. In addition to applying intelligent technologies to make Bing and Cortana smarter in gathering and serving academic knowledge, we are also starting an experimental website, academic.microsoft.com (powered by Academic API), and mobile apps dedicated to exploring new service scenarios for active researchers like myself. Please use the feedback mechanism to let us know what you think!

Before joining Microsoft, I worked at Bell Labs from 1994 to 1996, and at the NYNEX (now part of Verizon) Science and Technology Center. I received my M.S. and Ph.D. from the University of Maryland in 1989 and 1994, and my B.S. from National Taiwan University in 1986, all in Electrical Engineering.


Microsoft Academic Graph

Established: June 5, 2015

The Microsoft Academic Graph is a heterogeneous graph containing scientific publication records, citation relationships between those publications, as well as authors, institutions, journals, conferences, and fields of study. This graph is used to power experiences in Bing, Cortana, Word, and in Microsoft Academic.

Access the Microsoft Academic Graph

The Microsoft Academic Graph can be accessed via the Microsoft Cognitive Services Academic Knowledge API. The graph is currently being updated on a weekly basis. Azure resources are…


Established: March 11, 2014

A Large-Scale Real-World Image Dataset

We argue that the massive amount of click data from commercial search engines provides a data set that is unique in bridging the semantic and intent gaps. Search engines generate millions of click records (a.k.a. image-query pairs), which provide almost "unlimited" yet strong connections between semantics and images, as well as connections between users' intents and queries. This site introduces such a dataset, Clickture. The dataset,…

Multimodal Conversational User Interface

Established: January 29, 2004

Researchers in the Speech Technology group at Microsoft are working to allow the computer to travel through our living spaces as a handy electronic HAL pal that answers questions, arranges our calendars, and sends messages to our friends and family. Most of us use computers to create text, understand numbers, view images, and send messages. There's only one problem with this marvelous machine. Our computer lives on a desktop, and though we command it with…

Speech Application Language Tags (SALT)

Established: January 29, 2004

SALT is an XML-based API that brings speech interactions to the Web. Starting as a research project that aimed at applying the Web interaction model to spoken dialog, SALT has evolved into an industry standard with more than 70 companies and universities participating in advancing the standard. The design philosophy and a brief overview of SALT are published here. Microsoft has announced many SALT-based products, including SALT-aware desktop IE and Pocket IE add-ins.…


Established: February 19, 2002

Your Pad or MiPad

It only took one scientist mumbling at a monitor to give birth to the idea that a computer should be able to listen, understand, and even talk back. But years of effort haven't gotten us closer to the Jetson dream: a computer that listens better than your spouse, better than your boss, and even better than your dog Spot. Using state-of-the-art speech recognition, and strengthening this new science with pen input,…

MIPAD: A Multimodal Interactive Prototype
Xuedong Huang, Alex Acero, C. Chelba, Li Deng, Jasha Droppo, D. Duchene, J. Goodman, Hsiao-Wuen Hon, D. Jacoby, L. Jiang, Ricky Loynd, Milind Mahajan, P. Mau, S. Meredith, S. Mughal, S. Neto, M. Plumpe, K. Steury, Gina Venolia, Kuansan Wang, Ye-Yi Wang, in International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Inc., January 1, 2001


MiPad: A Next Generation PDA Prototype
Xuedong Huang, Alex Acero, C. Chelba, Li Deng, Doug Duchene, J. Goodman, Hsiao-Wuen Hon, D. Jacoby, Li Jiang, Ricky Loynd, Milind Mahajan, P. Mau, S. Meredith, Salman Mughal, S. Neto, M. Plumpe, Kuansan Wang, Ye-Yi Wang, in International Conference on Spoken Language Processing, International Speech Communication Association, January 1, 2000