Vision-and-Dialog Navigation

Dialog-enabled smart assistants, which communicate via natural language and occupy human homes, have seen widespread adoption in recent years. These systems can communicate information, but they do not manipulate objects or move themselves. By contrast, manipulation-capable and mobile robots are still largely deployed in industrial settings, where they do not interact with human users. Dialog-enabled robots can bridge this gap, with natural language interfaces helping robots and non-experts collaborate to achieve their goals. In particular, navigation in unseen or dynamic environments to high-level goals (e.g., “Go to the room with a plant”) can be facilitated by enabling navigation agents to ask questions in natural language and to react to human clarifications on the fly. To study this challenge, we introduce Cooperative Vision-and-Dialog Navigation, an English-language dataset situated in the Matterport Room-2-Room simulation environment.

Date:
Speakers:
Jesse Thomason
Affiliation:
University of Washington

Series: Microsoft Research Talks