After studying Computer Science and Mathematics at Carnegie Mellon University, I joined Microsoft in 2000 to work on the Intentional Programming project, an extensible compiler and development framework. I moved to the Natural Language Processing group in 2001, where my research has mostly focused on the statistical machine translation behind Microsoft Translator, especially on several generations of a syntax-directed translation system that powers over half of the translation systems. I am also interested in semantic parsing, paraphrase methods, and very practical problems such as spelling correction and transliteration.



OPAL is a programming language and environment for designing intelligent assistants based on natural language. Tasks in OPAL can explore multiple hypothetical worlds to resolve the ambiguity in users' intent by exploring its potential implications on the real world. Weightings…

Language to Code

Our goal is to let normal users tell computers what to do using normal language. This problem space is strongly related to natural language understanding, program synthesis, and many other areas. The data release associated with the following ACL…

Data-Driven Conversation

This project aims to enable people to converse with their devices. We are trying to teach devices to engage with humans using human language in ways that appear seamless and natural to humans. Our research focuses on statistical methods by…

NLPwin parses AMR

The Logical Form analysis produced by the NLPwin parser is very close in spirit to the level of semantic representation defined in AMR, Abstract Meaning Representation. The "NLPwin parses AMR" project is a conversion from LF to AMR in order…

Recurrent Neural Networks for Language Processing

This project focuses on advancing the state of the art in language processing with recurrent neural networks. We are currently applying these to language modeling, machine translation, speech recognition, language understanding and meaning representation. A special interest is in adding side-channels of information…


Statistical Parsing and Linguistic Analysis Toolkit is a linguistic analysis toolkit. Its main goal is to allow easy access to the linguistic analysis tools produced by the Natural Language Processing group at Microsoft Research. The tools include both traditional linguistic…


I mostly work with folks in the NLP and Machine Translation groups.

I’ve had the privilege of mentoring or working with a number of interns over the years, including Katharina Probst, Colin Cherry, Pavel Pecina, Ethan Phelps-Goodman, Vivek Srikumar, Arne Mauser, Hao Zhang, Jason Smith, Mohit Bansal, Arianna Bisazza, Mayank Srivastava, Jenny Lin, Joern Wuebker, Juri Ganitkevich, Hui Zhang, and Wei Deng.

Reviewer for ACL, EMNLP, COLING, and MT Summit consistently over the past 5+ years. Area Chair in MT: ACL 2009, EMNLP 2009, 2012.