We demonstrate this using a treasure-hunt game that guides the user along a previously authored path, indoors or outdoors, using geo-located arrows or floating 3D bubbles. Applications include games, city tours, and self-localization for mobile robotics. At its core, the system relies on technology for rapidly extracting distinctive features from images and matching them against a database of locations. This technology has already formed the backbone of released products such as Live Labs Photosynth and Microsoft Image Composite Editor.
Our current technology extracts “interest points” and “invariant descriptors” from images. These provide characteristic information about a visual scene and therefore allow matching from one scene to another. We used this to automatically stitch together many photographs in Microsoft Image Composite Editor, a tool for assembling panoramas, and we combined it with 3D reconstruction methods to create 3D scenes from collections of photographs in Live Labs Photosynth. Now that hand-held devices have video cameras and powerful processors, we are developing real-time solutions that continuously match what the camera of a mobile device sees against a database of known views of locations, obtained for example from Windows Live Street-side imagery. This localizes the device with significantly more detail than GPS alone provides.
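To make the matching step concrete, here is a minimal sketch of descriptor matching with a nearest-neighbor ratio test, a standard way to reject ambiguous correspondences. This is an illustrative example only: the function names are hypothetical, descriptors are shown as plain coordinate lists, and the actual system uses its own detector, descriptors, and indexing structures.

```python
# Illustrative sketch: match query descriptors to a database of descriptors,
# keeping only unambiguous nearest-neighbor matches (the "ratio test").
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(query, database, ratio=0.8):
    """For each query descriptor, find its nearest database descriptor.
    Accept the match only if the nearest neighbor is clearly closer than
    the second-nearest; this rejects ambiguous, repeated-texture matches."""
    matches = []
    for qi, q in enumerate(query):
        dists = sorted((euclidean(q, d), di) for di, d in enumerate(database))
        best, second = dists[0], dists[1]
        if best[0] < ratio * second[0]:
            matches.append((qi, best[1]))  # (query index, database index)
    return matches
```

In a real pipeline the surviving matches would then feed a geometric verification step (e.g. robust estimation of the camera pose) before a location is declared recognized.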
Augmented reality is the process of adding computer-graphics elements to images of real scenes to present all kinds of information to the user. Because we can track the moment-by-moment mapping between the camera view and stored views, we can add any location-related data to the user’s screen and make it appear to be part of the world. One can imagine a virtual tour guide who points out the details of the city as we walk around; this is just one of many applications.
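One simple form the camera-to-stored-view mapping can take, assuming a roughly planar scene such as a building facade, is a 3x3 homography estimated from matched features. The sketch below, with an illustrative function name, shows how an annotation placed in a stored reference view would be transferred into the live camera frame so it appears attached to the world.

```python
# Illustrative sketch: map a 2D annotation point from a stored reference view
# into the live camera view through a 3x3 homography H (row-major nested lists).
def apply_homography(H, point):
    """Apply H to a 2D point in homogeneous coordinates and
    return the dehomogenized (x, y) position in the target image."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

Re-estimating H (or, more generally, the full camera pose) every frame is what keeps the overlaid graphics locked to the scene as the device moves.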