The situated interaction research effort aims to enable computers to reason more deeply about their surroundings and to engage in fluid interaction with humans in physically situated settings. When people interact with each other, they engage in a rich, highly coordinated, mixed-initiative process regulated through both verbal and non-verbal channels. In contrast, while their perceptual abilities are improving, computers remain largely unaware of their physical surroundings and of the “physics” of human interaction. Current human-computer interaction…
My work centers on the study and development of computational models for physically situated spoken language interaction and collaboration. The long-term question that shapes my research agenda is: how can we enable interactive systems to reason more deeply about their surroundings and seamlessly participate in open-world, multiparty dialog and collaboration with people?
Physically situated interaction hinges critically on the ability to model and reason about processes such as conversational engagement, turn-taking, grounding, interaction planning, and action coordination. Creating robust solutions that operate in the real world brings broader AI challenges to the fore, including questions of representation (e.g., what formalisms yield actionable, robust models of multiparty interaction?), machine learning methods for multimodal inference from streaming sensory data, predictive modeling, and decision making and planning under uncertainty and temporal constraints.
Intelligent agents that can handle human language play a growing role in personalized, ubiquitous computing and the everyday use of devices. Agents need to be able to communicate and collaborate with humans in ways that are seamless and natural, and to be able to learn new behaviors, concepts, and relationships as first-class operations. In other words, our devices need to be able to converse with us. In this project, Microsoft Research AI teams are interested…
Conversational systems interact with people through language to assist, enable, or entertain. Research at Microsoft spans dialogs that use language exclusively or in conjunction with additional modalities like gesture; language that is spoken or typed; and a variety of settings, such as conversational systems in apps or devices, and situated interactions in the real world.

Projects

Spoken Language Understanding
Established: December 15, 2008
In the Voice Search project, we envision a future where you can ask your cellphone for any kind of information and get it. On a small cellphone, traditional keyboard-based information entry carries a heavy tax, and we believe it can be significantly more convenient to communicate by voice. Our work focuses on making this communication more reliable and on covering the full range of information needed in daily life.
Incremental Coordination: Attention-Centric Speech Production in a Physically Situated Conversational Agent. Zhou Yu, Dan Bohus, Eric Horvitz. SIGDIAL Conference, July 21, 2015.
Learning N-Best Correction Models from Implicit User Feedback in a Multi-Modal Local Search Application. Dan Bohus, Xiao Li, Patrick Nguyen, Geoffrey Zweig. Special Interest Group on Discourse and Dialogue (SIGdial), June 1, 2008.
Implicitly-Supervised Learning in Spoken Language Interfaces: An Application to the Confidence Annotation Problem. Dan Bohus, Alexander Rudnicky. Proceedings of SIGdial 2007, Antwerp, Belgium, September 1, 2007.
Initial Development of a Voice-Activated Astronaut Assistant for Procedural Tasks: From Need to Concept to Prototype. Gregory Aist, Dan Bohus, Brad Boven, Ellen Campana, Susana Early, Steven Phan. Learning Technology Institute, January 1, 2004.
Integrating Multiple Knowledge Sources for Utterance-Level Confidence Annotation in the CMU Communicator Spoken Dialog System. Dan Bohus, Alexander I. Rudnicky. Technical Report CS-190, Carnegie Mellon University, Pittsburgh, PA, November 1, 2002.