Faculty Summit 2017: The Edge of AI


The 18th annual Microsoft Research Faculty Summit in Redmond, WA on July 17 and 18, 2017 will consist of a variety of keynotes, talks, panels, and technologies focused on Artificial Intelligence (AI) research: The Edge of AI.

Microsoft AI researchers are striving to create intelligent machines that complement human reasoning and enrich human experiences and capabilities. At the core is the ability to harness the explosion of digital data and computational power with advanced algorithms that extend the ability of machines to learn, reason, sense, and understand, enabling collaborative and natural interactions between machines and humans.

We are seeing widespread investments in AI which are advancing the state of the art in machine intelligence and perception, enabling computers to interpret what they see, to communicate in natural language, to answer complex questions, and to interact with their environment. In addition to technological advances, researchers and thought leaders need to be concerned with the ethics and societal impact of intelligent technologies.

The Microsoft Research Faculty Summit 2017 will bring together thought leaders and researchers from a broad range of disciplines including computer science, social sciences, human design and interactions, and policy. Together we will highlight some of the key challenges posed by artificial intelligence, and will identify the next generation of approaches, techniques, and tools that will be needed to develop AI to solve the world’s most pressing challenges.

Focus Areas

We will explore the following areas:

  • Machine learning – Developing and improving algorithms that help computers learn from data to create more advanced, intelligent computer systems.
  • Human language technologies – Linking language to the world through speech recognition, language modeling, language understanding, and dialog systems.
  • Perception and sensing – Creating computers and devices which understand what they see to enable tasks ranging from autonomous driving to analysis of medical images.
  • AI, people, and society – Examining the societal and individual impacts of the spread of intelligent technologies to formulate best practices for their design.
  • Systems, tools and platforms – Integrating intelligent technologies to create interactive tools such as chatbots that incorporate contextual data to augment and enrich human reasoning.
  • Integrative intelligence – Weaving together advances in AI from disciplines such as computer vision and human language technologies to create end-to-end systems that learn from data and experience.
  • Cyber-physical systems and robotics – Developing methods to ensure the integrity of drones, robots and other intelligent technologies that interact with the physical world.
  • Human-AI collaboration – Harnessing research breakthroughs in artificial intelligence to design technologies that allow humans to interact with computers in novel, meaningful, and productive ways.
  • Decisions and planning – Reasoning about future events to enable informed collaborations between humans and intelligent agents.

Program Chairs

Christopher M. Bishop, Distinguished Scientist and Laboratory Director
Evelyne Viegas, Director
Roy Zimmermann, Director


Sunday, July 16

Time Session Location
Registration Open – Hilton Bellevue Hotel
Welcome Reception – Hilton Bellevue Hotel

Monday, July 17

Time Session Speaker Location
Opening Remarks
Technology Showcase – Lightning Round
Machine Reading Using Neural Machines
Chair: Lucy Vanderwende, Microsoft
  • Isabelle Augenstein, University College London
  • Jianfeng Gao, Microsoft
  • Percy Liang, Stanford University
  • Rangan Majumder, Microsoft
  • Dan Bohus, Microsoft
  • Ece Kamar, Microsoft
  • Louis-Philippe Morency, Carnegie Mellon University
AI for Accessibility: Augmenting Sensory Capabilities with Intelligent Technology
Chair: Meredith Ringel Morris, Microsoft
  • Jeffrey Bigham, Carnegie Mellon University
  • Shaun Kane, University of Colorado
  • Walter Lasecki, University of Michigan
Technology Showcase
3:15–4:45 Conversational Systems in the Era of Deep Learning and Big Data
Chair: Bill Dolan, Microsoft
  • Jackie Cheung, McGill University
  • Michel Galley, Microsoft
  • Ian Lane, Carnegie Mellon University
  • Alan Ritter, Ohio State University
  • Lucy Vanderwende, Microsoft
  • Jason Williams, Microsoft
From Visual Sensing to Visual Intelligence
Chair: Gang Hua, Microsoft
  • Rama Chellappa, University of Maryland
  • Katsu Ikeuchi, Microsoft
  • Song Chun Zhu, University of California–Los Angeles
Learnings from Human Perception
Chair: Mar Gonzalez Franco, Microsoft
  • Olaf Blanke, Ecole Polytechnique de Lausanne
  • Mel Slater, Universidad de Barcelona
  • Ana Tajadura-Jiménez, University College London
Travel to Dinner
6:30–9:00 Dinner at The Golf Club at Newcastle

Tuesday, July 18

Time Session Speaker Location
Provable Algorithms for ML/AI Problems
Chair: Prateek Jain, Microsoft
  • Sham Kakade, University of Washington
  • Ravi Kannan, Microsoft
  • Santosh Vempala, Georgia Institute of Technology
Private AI
Chair: Ran Gilad-Bachrach, Microsoft
  • Rich Caruana, Microsoft
  • JungHee Cheon, Seoul National University
  • Kristin Lauter, Microsoft
AI for Earth
Chair: Lucas Joppa, Microsoft
  • Tanya Berger-Wolf, University of Illinois Chicago
  • Carla Gomes, Cornell University
  • Milind Tambe, University of Southern California
11:30–1:00 Microsoft Cognitive Toolkit (CNTK) for Deep Learning
Chair: Chris Basoglu, Microsoft
  • Sayan Pathak, Microsoft
  • Yanmin Qian, Shanghai Jiaotong University
  • Cha Zhang, Microsoft
AI and Security
Chair: David Molnar, Microsoft
Panel: Emotionally Intelligent AI and Agents
Moderator: Mary Czerwinski, Microsoft
  • Justine Cassell, Carnegie Mellon University
  • Jonathan Gratch, University of Southern California
  • Daniel McDuff, Microsoft
  • Louis-Philippe Morency, Carnegie Mellon University
Transforming Machine Learning and Optimization through Quantum Computing
Chair: Krysta Svore, Microsoft
  • Helmut Katzgraber, Texas A&M
  • Matthias Troyer, Microsoft
  • Nathan Wiebe, Microsoft
Challenges and Opportunities in Human-Machine Partnership
Chair: Ece Kamar, Microsoft
  • Barbara Grosz, Harvard University
  • Milind Tambe, University of Southern California
Towards Socio-Culturally Aware AI
Chair: Indrani Medhi Thies, Microsoft
  • Cristian Danescu-Niculescu-Mizil, Cornell University
3:45–4:45 Keynote Kodiak
4:45–5:30 Keynote Kodiak
5:30–5:45 Closing Remarks Kodiak


Monday, July 17

Machine Reading Using Neural Machines

Speakers: Isabelle Augenstein, University College London; Jianfeng Gao, Microsoft; Rangan Majumder, Microsoft

Teaching machines to read, process, and comprehend natural language documents and images is a coveted goal in modern AI. We see growing interest in machine reading comprehension (MRC) due to potential industrial applications as well as technological advances, especially in deep learning and the availability of various MRC datasets that can benchmark different MRC systems. Despite the progress, many fundamental questions remain unanswered: Is question answering (QA) the proper task to test whether a machine can read? What is the right QA dataset to evaluate the reading capability of a machine? For speech recognition, the Switchboard dataset was a research goal for 20 years – why is there such a proliferation of datasets for machine reading? How important is model interpretability and how can it be measured? This session will bring together experts at the intersection of deep learning and natural language processing to explore these topics.


Speakers: Dan Bohus, Microsoft; Ece Kamar, Microsoft; Louis-Philippe Morency, Carnegie Mellon University

Over the last decade, algorithmic developments coupled with increased computation and data resources have led to advances in well-defined verticals of AI such as vision, speech recognition, natural language processing, and dialog technologies. However, the science of engineering larger, integrated systems that are efficient, robust, transparent, and maintainable is still very much in its infancy. Efforts to develop end-to-end intelligent systems that encapsulate multiple competencies and act in the open world have brought into focus new research challenges. Making progress towards this goal requires bringing together expertise from AI and systems, and this progress can be sped up with shared best practices, tools and platforms. This session will highlight opportunities and challenges for research and development for integrative AI systems. The speakers will address various aspects of integrative AI systems, from multimodal learning and troubleshooting to development through shared platforms.

AI for Accessibility: Augmenting Sensory Capabilities with Intelligent Technology

Speakers: Jeffrey Bigham, Carnegie Mellon University; Shaun Kane, University of Colorado; Walter Lasecki, University of Michigan

Advances in AI technologies have important ramifications for the development of accessible technologies. These technologies can augment the capabilities of people with sensory disabilities, enabling new and empowering experiences. In this session, we will present examples of how breakthroughs in AI can support key tasks for diverse user populations. Examples of such applications include image labeling on behalf of people with visual impairments, fast audio captioning for people who are hard-of-hearing, and better word prediction for people who rely on communication augmentation tools to speak.

Conversational Systems in the Era of Deep Learning and Big Data

Speakers: Jackie Cheung, McGill University; Michel Galley, Microsoft; Ian Lane, Carnegie Mellon University; Alan Ritter, Ohio State University; Lucy Vanderwende, Microsoft; Jason Williams, Microsoft

Recent research in recurrent neural models, combined with the availability of massive amounts of dialog data, has spurred the development of a new generation of conversational systems. Where past approaches focused on task-oriented dialog and relied on a pipeline of modules (such as language understanding and state tracking), new techniques learn end-to-end models trained exclusively on massive text transcripts of conversations. While promising, these new methods raise important questions: How can neural models go beyond chat-style dialog and interface with structured domain knowledge and programmatic APIs? How can these techniques be applied in domains where there is no existing dialog data? What new system behaviors are possible with these techniques and resources? This session will bring together experts at the intersection of deep learning and conversational systems to explore these topics through their ongoing work and expectations for the future.

From Visual Sensing to Visual Intelligence

Speakers: Rama Chellappa, University of Maryland; Katsu Ikeuchi, Microsoft; Song Chun Zhu, University of California-Los Angeles

Computer vision is arguably one of the most challenging subfields of AI. To better address the key challenges, the vision research community long ago branched off from the general AI community to focus on its core problems. In recent years, we have witnessed tremendous progress in visual sensing due to big data and more powerful learning machines. However, we still lack a holistic view of how visual sensing relates to more general intelligence. This session will bring researchers together to discuss research trends in computer vision, the role of visual sensing in more integrated general intelligence systems, and how visual sensing systems will interact with other sensing modalities from a computational perspective.

Learnings from Human Perception

Speakers: Olaf Blanke, Ecole Polytechnique de Lausanne; Mel Slater, Universidad de Barcelona; Ana Tajadura-Jiménez, University College London

Scientists have long explored the different sensory inputs to better understand how humans perceive the world and control their bodies. Many of the great discoveries about the human perceptual system were first found through laboratory experiments that stimulated inbound sensory inputs as well as outbound sensory predictions. These aspects of cognitive neuroscience have important implications when building technologies, as we learn to transfer abilities that are natural to humans to leverage the strengths of machines. Machines can also be used to learn further about human perception, because technology allows scientists to reproduce impossible events and observe how humans would respond and adapt to those events. This loop from human to machine and back again can help transfer what we learn from our evolutionary intelligence to future machines and AI. This session will address progress and challenges in applying human perception to machines, and vice versa.

Tuesday, July 18

Provable Algorithms for ML/AI Problems

Speakers: Sham Kakade, University of Washington; Ravi Kannan, Microsoft; Santosh Vempala, Georgia Institute of Technology

Machine learning (ML) has demonstrated success in various domains such as web search, ads, computer vision, natural language processing (NLP), and more. These success stories have led to a big focus on democratizing ML and building robust systems that can be applied to a variety of domains, problems, and data sizes. However, because typical ML algorithms are often poorly understood, experts frequently resort to trial and error to get a system working, which limits the types and applications of ML systems. Hence, designing provable and rigorous algorithms is critical to the success of such large-scale, general-purpose ML systems. The goal of this session is to bring together researchers from various communities (ML, algorithms, optimization, statistics, and more) along with researchers from more applied ML communities such as computer vision and NLP, with the intent of understanding the challenges involved in designing end-to-end robust, rigorous, and predictable ML systems.

Private AI

Speakers: Rich Caruana, Microsoft; JungHee Cheon, Seoul National University; Kristin Lauter, Microsoft

As the volume of data goes up, the quality of machine learning models, predictions, and services will improve. Once models are trained, predictive cloud services can be built on them, but users who want to take advantage of the services have serious privacy concerns about exposing consumer and enterprise data, such as private health or financial data, to machine learning services running in the cloud. Recent developments in cryptography provide tools to build and enable “Private AI,” including private predictive services that do not expose user data to the model owner, and that also provide the means to train powerful models across several private datasets that can be shared only in encrypted form. This session will examine the state of the art for these tools, and discuss important directions for the future of Private AI.

AI for Earth

Speakers: Tanya Berger-Wolf, University of Illinois-Chicago; Carla Gomes, Cornell University; Milind Tambe, University of Southern California

Human society is faced with an unprecedented challenge to mitigate and adapt to changing climates, ensure resilient water supplies, sustainably feed a population of 10 billion, and stem a catastrophic loss of biodiversity. Time is too short, and resources too thin, to achieve these outcomes without the exponential power and assistance of AI. Early efforts are encouraging, but current solutions are typically one-off attempts that require significant engineering beyond what’s available from the AI research community. In this session we will explore, in collaboration with the Computational Sustainability Network (a twice-funded National Science Foundation (NSF) Expedition), the latest applications of AI research to sustainability challenges, as well as ways to streamline environmental applications of AI so they can work with traditional academic programs. The speakers in this session will set the scene on the state of the art in AI for Earth research and frame the agenda for the next generation of AI applications.

Microsoft Cognitive Toolkit (CNTK) for Deep Learning

Speakers: Sayan Pathak, Microsoft; Yanmin Qian, Shanghai Jiaotong University; Cha Zhang, Microsoft

Microsoft Cognitive Toolkit (CNTK) is a production-grade, open-source, deep-learning library. In the spirit of democratizing AI tools, CNTK embraces fully open development, is available on GitHub, and provides support for both Windows and Linux. The recent 2.0 release (currently in release candidate) packs in several enhancements, most notably Python/C++ API support, easy-to-onboard tutorials (as Python notebooks) and examples, and an easy-to-use Layers interface. These enhancements, combined with unparalleled scalability on NVIDIA hardware, were demonstrated by both NVIDIA at SuperComputing 2016 and Cray at NIPS 2016. The CNTK enhancements also supported Microsoft in its recent breakthrough in speech recognition, reaching human parity in conversational speech. The toolkit is used for deep learning on image, video, speech, and text data. The speakers will discuss the current features of the toolkit’s release and its application to deep learning projects.

Panel: Emotionally Intelligent AI and Agents

Panelists: Justine Cassell, Carnegie Mellon University; Jonathan Gratch, University of Southern California; Daniel McDuff, Microsoft; Louis-Philippe Morency, Carnegie Mellon University

Emotions are fundamental to human interactions and influence memory, decision-making, and well-being. As AI systems—in particular, intelligent agents—become more advanced, there is increasing interest in applications that can fulfill task goals and social goals, and respond to emotional states. Research has shown that cognitive agents with these capabilities can increase empathy, rapport, and trust with their users. However, designing such agents is extremely complex, as most human knowledge of emotion is implicit/tacit and defined by unwritten rules. Furthermore, these rules are culturally dependent and not universal. This session will focus on research into intelligent cognitive agents. It will cover the measurement and understanding of verbal and non-verbal cues, the computational modeling of emotion, and the design of sentient virtual agents.

Transforming Machine Learning and Optimization through Quantum Computing

Speakers: Helmut Katzgraber, Texas A&M; Matthias Troyer, Microsoft; Nathan Wiebe, Microsoft

In 1982, Richard Feynman first proposed using a “quantum computer” to simulate physical systems with an exponential speedup over conventional computers. Quantum algorithms can solve problems in number theory, chemistry, and materials science that would otherwise take longer than the lifetime of the universe to solve on an exascale machine. Quantum computers offer new methods for machine learning, including training Boltzmann machines and perceptron models. These methods have the potential to dramatically improve upon today’s machine learning algorithms used in almost every device, from cell phones to cars. But can quantum models make it possible to probe altogether different types of questions and solutions? If so, how can we take advantage of new representations in machine learning? How will we handle large amounts of data and input/output on a quantum computer? This session will focus on both known improvements and open challenges in using quantum techniques for machine learning and optimization.

Challenges and Opportunities in Human-Machine Partnership

Speakers: Barbara Grosz, Harvard University; Milind Tambe, University of Southern California

The new wave of excitement about AI in recent years has been based on successes in perception tasks or on domains with limited and known dynamics. Because machines have achieved human parity in accuracy for image recognition and speech recognition and have beaten human champions on games such as Go and Poker, they have led to an impression of a future in which AI systems function alone. However, for more complex and open-ended tasks, current AI technologies have limitations. Future deployments of AI systems in daily life are likely to emerge from the complementary abilities of humans and machines and require close partnerships between them. The goal of this session is to highlight the potential of human-machine partnership through real-world applications. In addition, the speakers aim to identify challenges for research and development that, when solved, will build towards successful AI systems that can partner with people.


Tanya Berger-Wolf
University of Illinois at Chicago


Tanya Berger-Wolf is a professor of computer science at the University of Illinois at Chicago, where she heads the Computational Population Biology Lab. As a computational ecologist, her research is at the unique intersection of computer science, wildlife biology, and social sciences. She creates computational solutions to address questions such as how environmental factors affect the behaviors of social animals (humans included). Berger-Wolf is also a cofounder of the conservation software nonprofit Wildbook, which recently enabled the first-of-its-kind complete species census of the endangered Grevy’s zebra, using photographs taken by ordinary citizens in Kenya.

Berger-Wolf holds a PhD in computer science from the University of Illinois at Urbana-Champaign. She has received numerous awards for her research and mentoring, including the US National Science Foundation CAREER Award, Association for Women in Science Chicago Innovator Award, and the UIC Mentor of the Year Award.

Justine Cassell
Carnegie Mellon University


Justine Cassell is associate dean of technology strategy and impact and professor in the School of Computer Science at Carnegie Mellon University, and Director Emerita of the Human Computer Interaction Institute. She codirects the Yahoo-CMU InMind partnership on the future of personal assistants. Previously, Cassell was on the faculty at Northwestern University, where she founded the Technology and Social Behavior Center and doctoral program. Before that, she was a tenured professor at the MIT Media Lab. Cassell received the MIT Edgerton Award and the Anita Borg Institute Women of Vision Award. She was named to the World Economic Forum Global Agenda Council on AI and Robotics in 2011, was named an AAAS Fellow in 2012, and in 2016 was made a Fellow of the Royal Academy of Scotland and named an ACM Fellow. Cassell has spoken at the World Economic Forum in Davos for the past five years on topics concerning artificial intelligence and society.

Jackie Chi Kit Cheung
McGill University


Jackie Chi Kit Cheung is an assistant professor in the School of Computer Science at McGill University, where he codirects the Reasoning and Learning Lab. He received his PhD at the University of Toronto, and was awarded a Facebook Fellowship for his doctoral research. He and his team conduct research on computational semantics and natural language generation, with the goal of developing systems that can perform complex reasoning in tasks such as event understanding and automatic summarization.

Michel Galley
Microsoft Research


Michel Galley is a researcher at Microsoft Research. His research interests are in the areas of natural language processing and machine learning, with a particular focus on dialog, machine translation, and summarization. Galley obtained his MS and PhD from Columbia University and his BS from École polytechnique fédérale de Lausanne (EPFL), all in computer science. Before joining Microsoft Research, he was a research associate in the computer science department at Stanford University. He also spent summers visiting University of Southern California’s Information Sciences Institute and the Spoken Dialog Systems group at Bell Labs. Galley served twice as area chair at top natural language processing (NLP) conferences (ACL and NAACL), and was twice best paper finalist (NAACL 2010 and EMNLP 2013).

Jonathan Gratch
University of Southern California


Jonathan Gratch is director for virtual human research at the University of Southern California’s (USC) Institute for Creative Technologies, a research full professor of computer science and psychology at USC, and director of USC’s Computational Emotion Group. He completed his PhD in computer science at the University of Illinois at Urbana-Champaign in 1995. Gratch’s research focuses on computational models of human cognitive and social processes, especially emotion, and explores these models’ role in shaping human-computer interactions in virtual environments. He is the founding editor-in-chief of IEEE’s Transactions on Affective Computing, associate editor of Emotion Review and the Journal of Autonomous Agents and Multiagent Systems, and former president of the Association for the Advancement of Affective Computing. He is an AAAI Fellow, a SIGART Autonomous Agents Research Award recipient, a senior member of IEEE, and a member of the Academy of Management and the International Society for Research on Emotion. Gratch is the author of more than 300 technical articles.

Barbara Grosz
Harvard University


Barbara Grosz is Higgins Professor of Natural Sciences in the School of Engineering and Applied Sciences at Harvard University. She has made many contributions to the field of artificial intelligence (AI) through her pioneering research in natural language processing and in theories of multiagent collaboration and their application to human-computer interaction. She was founding dean of science and then dean of Harvard’s Radcliffe Institute for Advanced Study, and she is known for her role in the establishment and leadership of interdisciplinary institutions and for her contributions to the advancement of women in science. She currently chairs the Standing Committee for Stanford’s One Hundred Year Study on Artificial Intelligence and serves on the boards of several scientific and scholarly institutes. A member of the National Academy of Engineering and the American Philosophical Society, she is a fellow of the American Academy of Arts and Sciences, the Association for the Advancement of Artificial Intelligence, and the Association for Computing Machinery, and a corresponding fellow of the Royal Society of Edinburgh. She received the 2009 ACM/AAAI Allen Newell Award and the 2015 IJCAI Award for Research Excellence, AI’s highest honor.

Ana Tajadura-Jiménez
Universidad Loyola Andalucía


Ana Tajadura-Jiménez studied telecommunications engineering at Universidad Politécnica de Madrid. She obtained an MSc in Digital Communications Systems and Technology and a PhD in applied acoustics at Chalmers University of Technology, Sweden. Tajadura-Jiménez was a post-doctoral researcher in the Lab of Action and Body at Royal Holloway, University of London, an ESRC Future Research Leader at University College London Interaction Centre (UCLIC), and principal investigator (PI) of the project The Hearing Body. Since 2016 Tajadura-Jiménez has been a Ramón y Cajal research fellow at Universidad Loyola Andalucía (ULA) and Honorary Research Associate at UCLIC. At ULA, she is part of the Human Neuroscience Laboratory and coordinates the research line called “Multisensory stimulation to alter the perception of body and space, emotion and motor behavior.” She is currently PI of the project Magic Shoes. Tajadura-Jiménez’s research is empirical and multidisciplinary, combining perspectives of psychoacoustics, neuroscience, and human-computer interaction.

Louis-Philippe Morency
Carnegie Mellon University


Louis-Philippe Morency is assistant professor in the Language Technologies Institute at Carnegie Mellon University, where he leads the Multimodal Communication and Machine Learning Laboratory (MultiComp Lab). He was formerly research assistant professor in the Computer Sciences Department at University of Southern California (USC) and research scientist at USC Institute for Creative Technologies. Morency received his PhD and Master’s degrees from MIT Computer Science and Artificial Intelligence Laboratory. His research focuses on building the computational foundations that enable computers to analyze, recognize, and predict subtle human communicative behaviors during social interactions. In particular, Morency was lead co-investigator for the multi-institution effort that created SimSensei and MultiSense, two technologies to automatically assess nonverbal behavior indicators of psychological distress. He is currently chair of the advisory committee for ACM International Conference on Multimodal Interaction and associate editor at IEEE Transactions on Affective Computing.

Alan Ritter
Ohio State University


Alan Ritter is an assistant professor in computer science at Ohio State University. His research interests include natural language processing, social media analysis, and machine learning. Ritter completed his PhD at the University of Washington and was a postdoctoral fellow in the Machine Learning Department at Carnegie Mellon University. He has received an NDSEG fellowship, a best student paper award at IUI, an NSF CRII, and has served as an area chair for ACL, EMNLP, and NAACL.

Milind Tambe
University of Southern California


Milind Tambe is the Helen N. and Emmett H. Jones Professor in Engineering at University of Southern California (USC) and founding codirector of CAIS, the USC Center for AI in Society. He is a fellow of AAAI and ACM, and recipient of the ACM/SIGART Autonomous Agents Research Award, the Christopher Columbus Fellowship Foundation Homeland Security Award, the INFORMS Wagner Prize for Excellence in Operations Research Practice, and the Rist Prize of the Military Operations Research Society, as well as an influential paper award and multiple best paper awards at conferences such as AAMAS, IJCAI, IAAI, and IVA. Tambe’s pioneering real-world deployments of his “security games” research, based on computational game theory, have led him and his team to receive commendations from the US Coast Guard, the US Federal Air Marshals Service, and LA Airport Police. He has also cofounded a company based on his research, Avata Intelligence, where he serves as the director of research.

Lucy Vanderwende
Microsoft Research


Lucy Vanderwende’s research focuses on the acquisition and representation of semantic information, specifically the implicit meaning inferred from explicit signals, both linguistic and nonlinguistic. Vanderwende holds a PhD in computational linguistics from Georgetown University. She worked at IBM Bethesda on natural language processing, and was a visiting scientist at the Institute for Systems Science in Singapore. Vanderwende was program cochair for NAACL in 2009 and general chair for NAACL in 2013. She is also affiliate associate faculty at University of Washington Department of Biomedical Health Informatics, and a member of the UW BioNLP group, which is using NLP technology to extract critical information from patient reports.

Technology Showcase

Accelerating DNNs (Deep Neural Networks) on FPGAs with Hardware Microservices

Contact: Sitaram Lanka

The BrainWave deep learning platform, running on field-programmable gate array (FPGA)-based hardware microservices, supports democratizing AI for all of Microsoft. Hardware microservices enable direct, ultra-low-latency access to hundreds of thousands of FPGAs from software running anywhere in the datacenter. To this end, we will show two demos: (1) a hardware microservices implementation of a large-scale deep learning model to improve Bing query relevance; and (2) compiler and runtime support to enable developers working in either CNTK or TensorFlow to easily leverage the BrainWave platform.

AI for Earth Classification

Contact: Lucas Joppa

Understanding the land cover types and locations within specific regions enables effective environmental conservation. With sufficiently high spatial and temporal resolution, scientists and planners can identify which natural resources are at risk and the level of risk. This information helps inform decisions about how and where to focus conservation efforts. Current land cover products don’t meet these spatial and temporal requirements. Microsoft AI for Earth Program’s Land Cover Classification Project will use deep learning algorithms to deliver a scalable Azure pipeline for turning high-resolution US government images into categorized land cover data at regional and national scales. The first application of the platform will produce a land cover map for the Puget Sound watershed. This watershed is Microsoft’s own backyard and one of the nation’s most environmentally and economically complex and dynamic landscapes.

Bing Visual Search

Contact: Linjun Yang

Visual search, AKA search by image, is a new way of searching for information using an image or part of an image as the query. Similar to text search, which connects keyword queries to knowledge on the web, the ultimate goal of visual search is to connect camera captured data or images to web knowledge. Bing has been continuously improving its visual search feature, which is now available on Bing desktop, mobile, and apps, as well as Edge browser. It can be used not only for searching for similar images but also for task completion, such as looking for similar products while shopping. Bing image search now also features image annotation and object detection, to further improve the user experience. This demo will show these techniques and the scenarios for which the techniques were developed.

Custom Vision Service

Contact: Anna Roth

This demo shows how Custom Vision Service can be applied to many AI vision applications. For example, if a client needs to build a custom image classifier, they can submit a few images of objects, and a model is deployed at the touch of a button. Microsoft Office is also using Custom Vision Service to automatically caption images in PowerPoint.

Customizing Speech Recognition for Higher Accuracy Transcriptions

Contact: Olivier Nano

Two of the most important components of speech recognition systems are the acoustic model and the language model. The models behind Microsoft’s speech recognition engine have been optimized for certain usage scenarios, such as interacting with Cortana on a smartphone, searching the web by voice, or sending text messages to a friend. But if a user has specific needs, such as recognizing domain-specific vocabulary or understanding accents, then the acoustic and language models need to be customized. This demo will show the benefits of customizing acoustic and language models to improve the accuracy of speech recognition for lectures. Using the techniques of the Custom Speech Service (a Cognitive Service), the demo will show how the technology can tune speech recognition for specific topics and lecturers.

Deep Artistic Style Transfer: From Images to Videos

Contact: Gang Hua

This demo presents several applications of Microsoft’s recent work in artistic style transfer for images and videos. One technology, called StyleBank, provides an explicit representation for visual styles with a feedforward deep network that can clearly separate the content and style of an image. This framework can render stylized videos online, achieving more stable rendering results than in the past. In addition, the Deep Image Analogy technique takes a pair of images and transfers the visual attributes from one to the other. It enables a wide variety of applications in artistic effects.

DeepFind: Searching within Documents to Answer Natural Language Questions

Contact: Dan Deutsch

Searching within web documents on mobile devices is difficult and unnatural: ctrl-f searches only for exact matches, and it’s hard to see the search results. DeepFind takes a step toward solving this problem by allowing users to search within web documents using natural language queries and by displaying snippets from the document that answer the user’s questions.

Users can interact with DeepFind on bing.com, m.bing.com, and the Bing iOS App in two different ways: as an overlay experience, which encourages exploration and follow-up questions, or as a rich carousel of document snippets integrated directly into the search engine results pages, which proactively answers the user’s question.

InfoBots – AI Powered QnA System

Contact: Nilesh Bhide

As we move into the world of messaging apps, bots, and the botification of content, users are starting to move from keyword searches to relying on bots and assistants for their information-seeking needs. Bing has built InfoBots, a set of AI- and Bing-powered QnA capabilities that bots can leverage to help users with those needs. InfoBots QnA capabilities are tuned for answering any information-seeking question from a wide variety of content (open-domain content from the Internet, specific vertical domain content, etc.). InfoBots supports conversational QnA through multi-turn question and answer understanding to answer natural-language-based questions. InfoBots capabilities have applications in both consumer and enterprise contexts.

InstaFact – Bringing Knowledge to Office Apps

Contact: Silviu-Petru Cucerzan

This demo shows how InstaFact brings the information and intelligence of the Satori knowledge graph into Microsoft’s productivity software. InstaFact can automatically complete factual information in the text a user is writing or can verify the accuracy of facts in text. It can infer the user’s needs based on data correlations and simple natural-language clues. It can expose in simple ways the data and structure Satori harvests from the Web, and let users populate their text documents and spreadsheets with up-to-date information in just a couple of clicks.

Machine Reading Comprehension over Automotive Manual

Contact: Mahmoud Adada

Maluuba’s vision is to build literate machines. The research team has built deep learning models that can process written unstructured text and answer questions against it. The demo will showcase Maluuba’s machine reading comprehension (MRC) system by ingesting a 400-page automotive manual and answering users’ questions about it in real time. The long-term vision for this product is to apply MRC technology to all types of user manuals, such as cars, home appliances, and more.

Machine Teaching Using the Platform for Interactive Concept Learning (PICL)

Contact: Jina Suh

Building machine learning (ML) models is an involved process requiring ML experts, engineers, and labelers. The demand for models for common-sense tasks far exceeds the number of available “teachers” who can build them. We approach this problem by allowing domain experts to apply what we call Machine Teaching (MT) principles. These include mining domain knowledge, concept decomposition, ideation, debugging, and semantic data exploration.

PICL is a toolkit that originated from the MT vision. It enables teachers with no ML expertise to build classifiers and extractors. The underlying SDK enables system designers and engineers to build customized experiences for their problem domain. In PICL, teachers can bring their own dataset, search or sample items to label using active learning strategies, label these items, create or edit features, monitor model performance, and review and debug errors, all in one place.

Microsoft AI+R and Human-Robot Collaboration

Contact: David Baumert

This demonstration project, created jointly by the Microsoft Artificial Intelligence and Research (AI+R) Strategic Prototyping team and MSRA, uses Softbank’s Pepper robot as testbed hardware to show a set of human-collaboration activities based on Microsoft Cognitive Services and other Microsoft Research technologies.

As both a research and prototype-engineering effort, this project is designed to implement software technology and learn from concepts such as Brooks’ subsumption architecture, which distributes the brain activities of the robot between the local device for reflex functions, the local facility infrastructure for recognition functions, and remote API services hosted in the cloud for cognitive functions. This implementation is designed to be machine-independent and relevant to all robots requiring human-collaboration capabilities. This approach has supported new investigations such as non-verbal communication and body movements expressed and documented using Labanotation, making it possible for a robot to process conversations with humans and automatically generate life-like and meaningful physical behaviors to accompany its spoken words.

Microsoft Translator live

Contact: Chris Wendt

Microsoft Translator live enables users to hold translated conversations across two or more languages, with up to 100 participants at the same time using PowerPoint, iOS, Android, Windows, and web endpoints. Businesses, retail stores, and organizations around the world need to interact with customers who don’t speak the same language as the service providers, and Microsoft Translator live is an answer to all these needs.

Mobile Directions Robot

Contact: Dan Bohus

This demo shows our work on a mobile robot that gives directions to visitors. Currently, this robot is navigating Microsoft Building 99, leading people, escorting and interacting with visitors and generally providing a social presence in the building. This robot uses Microsoft’s Platform for Situated Intelligence and Windows components for human interaction, as well as a robot operating system running under Linux for robot control, localization and navigation.

Project Inner Eye

Contact: Ivan Tarapov

Project Inner Eye is a new AI product targeted at improving the productivity of oncologists, radiologists, and surgeons when working with radiological images. The project’s main focus is on the treatment of tumors and on monitoring the progression of cancer in temporal studies. InnerEye builds upon many years of research in computer vision and machine learning. It employs decision forests (as used already in Kinect and HoloLens) to help radiation oncologists and radiologists deliver better care, more efficiently and consistently, to their cancer patients.

Project Malmo – Experimentation Platform for the next Generation of AI Research

Contact: Katja Hofmann

Project Malmo is an open source AI experimentation platform that supports fundamental AI research. With the platform, Microsoft provides an experimentation environment in which promising approaches can be systematically and easily compared, and that fosters collaboration between researchers. Project Malmo is built on top of Minecraft, which is particularly appealing due to its design: open-ended, collaborative, and creative. Project Malmo particularly focuses on Collaborative AI – developing AI agents that can learn to collaborate with other agents, including humans, to help them achieve their goals. To foster research in this area, Microsoft recently ran the Malmo Collaborative AI Challenge, in which more than 80 teams of students worldwide competed to develop new algorithms that facilitate collaboration. This demo presents the results of the collaborative AI challenge task and shows selected agents and how new tasks and agents can be easily implemented.


Contact: Ying Wang

Zo is a sophisticated machine conversationalist with the personality of a 22-year-old with #friendgoals. She hangs out on Kik and Facebook and is always interested in a casual conversation with her growing crowd of human friends. Zo is an open-domain chatbot and her breadth of knowledge is vast. She can chime into a conversation with context-specific facts about things like celebrities, sports, or finance but she also has empathy, a sense of humor, and a healthy helping of sass. Using sentiment analysis, she can adapt her phrasing and responses based on positive or negative cues from her human counterparts. She can tell jokes, read your horoscope, challenge you to rhyming competitions, and much more. In addition to content, the phrasing of the conversations must sound natural, idiomatic, and human in both text and voice modalities. Zo’s “mind” is a sophisticated array of multiple machine learning (ML) techniques all working in sequence and in parallel to produce a unique, entertaining and, at times, amazingly human conversational experience. This demo shows some of Zo’s latest capabilities and how the team has achieved these technical accomplishments.