Lucy Vanderwende

Senior Researcher


Lucy holds a Ph.D. in Computational Linguistics from Georgetown University in Washington, D.C. She worked on natural language processing at IBM Bethesda from 1988 to 1990. In 1991, she was a Visiting Scientist at the Institute for Systems Science in Singapore. Lucy has worked at Microsoft Research since 1992. She was Program Co-Chair for NAACL in 2009 and General Chair for NAACL in 2013. Since 2011, Lucy has also been Affiliate Associate Faculty at the University of Washington Department of Biomedical Health Informatics and a member of the UW BioNLP group, which uses NLP technology to extract critical information from patient reports.

Research Interests

MindNet: automated acquisition of semantic knowledge
Summarization, focusing on summary generation and evaluation
Making reading more effective
Question Generation
Computer-Assisted Grading
NLPwin, robust, broad-coverage language analysis at Microsoft
NLP and Healthcare



Biomedical Natural Language Processing

The biomedical sciences are beginning to undergo a major transformation. Precision medicine has the potential to make treatments much more effective by better understanding patients, biological mechanisms, and therapeutic effects. However, current approaches only reach a small fraction of the patient population.  Consider the molecular tumor board: dozens of highly paid specialists create a custom treatment plan for an individual patient, combing the research literature for research advances that are relevant to the cancer of…

Conversational Intelligence

Intelligent agents that can handle human language play a growing role in personalized, ubiquitous computing and the everyday use of devices. Agents need to be able to communicate and collaborate with humans in ways that are seamless and natural, and to be able to learn new behaviors, concepts, and relationships as first-class operations. In other words, our devices need to be able to converse with us. In this project, Microsoft Research AI teams are interested…

NLPwin parses AMR

Established: March 17, 2015

The Logical Form (LF) analysis produced by the NLPwin parser is very close in spirit to the level of semantic representation defined in Abstract Meaning Representation (AMR). The "NLPwin parses AMR" project is a conversion from LF to AMR in order to facilitate 1) evaluation of the NLPwin LF and 2) contribution to the ongoing discussion of the specification of AMR. In this project, we include publications, as well as links to our LF training data converted…


Established: October 3, 2014

An introduction by Lucy Vanderwende, on behalf of everyone who contributed to the development of NLPwin. NLPwin is a software project at Microsoft Research that aims to provide Natural Language Processing tools for Windows (hence, NLPwin). The project was started in 1991, just as Microsoft inaugurated the Microsoft Research group; while active development of NLPwin continued through 2002, it is still being updated regularly, primarily in service of Machine Translation. NLPwin was and is still…

Data-Driven Conversation

Established: June 1, 2014

This project aims to enable people to converse with their devices. We are trying to teach devices to engage with humans using human language in ways that appear seamless and natural to humans. Our research focuses on statistical methods by which devices can learn from human-human conversational interactions and can situate responses in the verbal context and in physical or virtual environments. Natural and Engaging Agents that process human language will play a growing role…


Established: April 4, 2012

Statistical Parsing and Linguistic Analysis Toolkit is a linguistic analysis toolkit. Its main goal is to allow easy access to the linguistic analysis tools produced by the Natural Language Processing group at Microsoft Research. The tools include both traditional linguistic analysis tools, such as part-of-speech taggers and parsers, and more recent developments, such as sentiment analysis (identifying whether a particular piece of text has positive or negative sentiment towards its focus). Demo URL: You can find…

Microsoft Research ESL Assistant

Established: May 9, 2008

The Microsoft Research ESL Assistant is a web service that provides correction suggestions for typical ESL (English as a Second Language) errors. Such errors include, for example, the choice of determiners (the/a) and the choice of prepositions. The web service also provides word choice suggestions from a thesaurus. In order to help the user make decisions on whether to accept a suggestion, the service displays "before and after" web search results so that the user…


Established: December 19, 2001

Overview: MindNet is a knowledge representation project that uses our broad-coverage parser to build semantic networks from dictionaries, encyclopedias, and free text. MindNets are produced by a fully automatic process that takes the input text, sentence-breaks it, parses each sentence to build a semantic dependency graph (Logical Form), aggregates these individual graphs into a single large graph, and then assigns probabilistic weights to subgraphs based on their frequency in the corpus as a whole. The…




Visual Storytelling
Ting-Hao (Kenneth) Huang, Francis Ferraro, Nasrin Mostafazadeh, Ishan Misra, Aishwarya Agrawal, Jacob Devlin, Ross Girshick, Xiaodong He, Pushmeet Kohli, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh, Lucy Vanderwende, Michel Galley, Margaret Mitchell, in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), June 13, 2016



Mining Text Snippets for Images on the Web
Anitha Kannan, Simon Baker, Krishnan Ramnath, Juliet Fiss, Dahua Lin, Lucy Vanderwende, Rizwan Ansary, Ashish Kapoor, Qifa Ke, Matt Uyttendaele, Xin-Jing Wang, Lei Zhang, in KDD '14: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM Press, August 24, 2014


Probabilistic Frame Induction
Jackie Chi Kit Cheung, Hoifung Poon, Lucy Vanderwende, in Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), ACL/SIGPARSE, January 1, 2013


















Machine Reading Using Neural Machines


July 17, 2017


Lucy Vanderwende; Percy Liang; Jianfeng Gao; Rangan Majumder; Isabelle Augenstein


Microsoft; Stanford University; Microsoft; Microsoft; University College London

Faculty Summit 2016: Hot Topics


July 13, 2016


Lucy Vanderwende, Ran Gilad-Bachrach, Bryan Parno and Nikhil Swamy, Neil Dalchau


Microsoft Research

Commonsense and World Knowledge


July 24, 2015


Benjamin Van Durme, Dan Roth, Lucy Vanderwende, Margaret Mitchell, and Raymond J. Mooney


Microsoft Research, University of Texas at Austin, University of Illinois at Urbana-Champaign, Johns Hopkins University

UW/MS symposium


June 6, 2008


Danyel Fisher, Douglas Downey, Chris Quirk, Scott Drellishak, Kelly O'Hara, Emily M. Bender, Sumit Basu, Matthew Hurst, Arnd Christian König, Michael Gamon, Chris Brockett, Dmitriy Belenko, Bill Dolan, Jianfeng Gao, and Lucy Vanderwende


Visual Question Generation dataset

October 2016

We introduce this dataset to support the novel task of Visual Question Generation (VQG), where, given an image, the system should ‘ask a natural and engaging question’. This dataset can be used to support research on common sense reasoning and computer-human conversational systems.



Lucy’s research focuses on text understanding. She is deeply involved with developing MindNet, a method for automatically acquiring semantic information. All types of semantic information can be identified in and extracted from text. Dictionaries can provide the semantic information, for example, that a sheep is an animal; encyclopedias provide specific knowledge, for example, that Armstrong landed on the moon. Specialized data sets provide information on a given topic, for example, that Microsoft Research was founded in 1991. Common sense information can also be extracted from web-scale resources. Such information can be extracted in a variety of ways, from rule-based to completely unsupervised.

Lucy’s focus is to work with applications that demonstrate how the information in a knowledge resource can be used to improve human understanding and productivity. In particular, she has been involved in several healthcare projects aimed at understanding and structuring the information contained in unstructured text, such as a patient’s clinical records (e.g., for phenotype prediction) or biomedical scientific publications. Understanding the author’s commitment to the reliability of a statement (sometimes called assertion detection) is key to providing a robust understanding of the text.

Lucy is also excited to be working on ways to make reading more effective. One avenue is to support a reader’s mastery of the text by using Question Generation to create quizzes for arbitrary selections of text. With such quizzes, readers can see for themselves which parts of the text they know and which they should re-read. The value of open-response questions to support learning is well known. She is also working on enabling teachers to pose open-response questions by creating a workflow called Powergrading, where the teacher grades clusters of answers simultaneously, identifies answers that don’t belong in the cluster, and provides rich feedback while gaining insight into how well the students are doing in class.