Augmented reality (AR) takes natural user input (NUI), such as
gestures, voice, and eye gaze, and produces digital visual overlays on top
of the reality seen by the user. Today, multiple shipping AR applications
exist, most notably titles for the Microsoft Kinect and smartphone
applications such as Layar, Wikitude, and Junaio. Despite this
activity, little attention has been paid to operating system support
for AR applications. Instead, each AR application today does its own
sensing and rendering, with the help of user-level libraries like
OpenCV or the Microsoft Kinect SDK.

In this paper, we explore how operating systems should evolve to
support AR applications. Because AR applications work with
fundamentally new inputs and outputs, an OS that supports AR
applications needs to re-think the input and display abstractions
exposed to applications. Unlike the mouse and keyboard, which form
explicit, separate channels for user input, NUI requires continuous
sensing of the real-world environment, in which sensitive data is often
mixed with user input. Hence, the OS input abstractions must ensure
that user privacy is not violated, and the OS must provide a
fine-grained permission system for access to recognized objects like a
user’s face and skeleton. In addition, because visual outputs of AR
applications mix real-world and virtual objects, the synthetic window
abstraction in traditional GUIs is no longer viable, and OSes must
rethink the display abstractions and their management. We discuss
research directions for solving these and other issues and building an
OS that lets multiple applications share one (augmented) reality.
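To make the idea of fine-grained permissions on recognized objects concrete, the following is a minimal sketch, not a design from the paper: applications subscribe to high-level recognizer outputs (e.g., skeleton, face) instead of raw camera frames, and an OS-level permission layer releases only the object types the user has granted. All class, application, and recognizer names here are hypothetical.

```python
# Hypothetical sketch of recognizer-level permissions: apps receive
# recognized objects (skeleton, face, ...) rather than raw sensor data,
# and the OS filters events by per-app grants. Names are illustrative.

class RecognizerPermissions:
    def __init__(self):
        self._grants = {}  # app name -> set of granted recognizer names

    def grant(self, app, recognizer):
        # Record that the user allowed `app` to see `recognizer` output.
        self._grants.setdefault(app, set()).add(recognizer)

    def filter_events(self, app, events):
        # Deliver only events whose recognizer the app may access;
        # everything else (e.g., raw frames) is withheld by the OS.
        allowed = self._grants.get(app, set())
        return [e for e in events if e["recognizer"] in allowed]


perms = RecognizerPermissions()
# The user allows a fitness app to see skeleton data, but not raw video.
perms.grant("fitness_app", "skeleton")

events = [
    {"recognizer": "skeleton", "data": "joint positions"},
    {"recognizer": "raw_frame", "data": "full RGB image"},  # sensitive
]
delivered = perms.filter_events("fitness_app", events)
# Only the skeleton event reaches the app; the raw frame never does.
```

The point of the sketch is that the privacy boundary sits at the granularity of recognized objects, so an application never needs (or gets) access to the full sensed environment.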