Microsoft Research Blog


A Strong Sense for Natural Interactions

October 5, 2014 | By Microsoft blog editor

Hrvoje Benko

This week, Microsoft researcher Hrvoje Benko (@hrvojebenko) is in Hawaii, but not on one of the islands’ beautiful beaches. As conference chair for UIST 2014—the 27th Association for Computing Machinery (ACM) Symposium on User Interface Software and Technology—Benko will be busy ensuring that the event, the premier forum for innovations in the software and technology of human-computer interaction (HCI), proceeds smoothly.

In addition to serving as conference chair, Benko—subject of a new Microsoft Research Luminaries video—has contributed to three of the eight Microsoft papers being presented during the conference, including Sensing Techniques for Tablet+Stylus Interaction, which won an ACM UIST 2014 Best Paper Award. This is the fourth paper he has co-written to receive a best-paper award at a major conference, evidence of a remarkable career for a young scientist who joined Microsoft Research in 2007, just after earning his Ph.D.

Bringing Augmented Reality to Ordinary Spaces

The award-winning paper, though, might not be the most attention-grabbing contribution to UIST 2014 from Benko and his colleagues. Many attendees are certain to be amazed by the research featured in the paper RoomAlive: Magical Experiences Enabled by Scalable, Adaptive Projector-Camera Units, written by Benko and a host of Microsoft and academic collaborators.

RoomAlive uses a unified, scalable, multiprojector system that adapts gaming content to the room it occupies and explores additional methods for physical interaction within that room, resulting in an immersive gaming experience.

“RoomAlive,” Benko says, “enables any space to be transformed into an augmented, interactive display.”

It will be a while before that proof of concept becomes a practical reality, but it certainly is tantalizing. The RoomAlive prototype deploys projectors and depth cameras to cover an entire room—including the furniture and the people inside—with pixels that can serve as both input and output. The result is an experience that coexists seamlessly with the existing physical environment.

The system that drives RoomAlive uses multiple projector-camera units—referred to as “procams”—consisting of a depth camera, a wide-field-of-view projector, and a computer. These devices are combined via a scalable, distributed framework to cover an entire room. The procams are auto-calibrating and can self-localize as long as their views have some overlap.
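The paper describes the full auto-calibration pipeline; the sketch below illustrates only the core self-localization idea, under the assumption that two overlapping units observe a common set of 3-D points. The Kabsch algorithm shown here is a standard alignment technique, not necessarily the one RoomAlive uses:

```python
import numpy as np

def kabsch(src, dst):
    """Rigid transform (R, t) aligning src points onto dst points.

    src, dst: (N, 3) arrays of corresponding 3-D points observed by two
    procam units in their overlapping view region (hypothetical input;
    real correspondences would come from the depth cameras).
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Example: unit B's frame is unit A's frame rotated 90 degrees about Z
# and shifted by a fixed offset.
rng = np.random.default_rng(0)
pts_a = rng.uniform(-1, 1, size=(20, 3))       # points in unit A's frame
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
pts_b = pts_a @ R_true.T + t_true              # same points in unit B's frame

R, t = kabsch(pts_a, pts_b)
```

Chaining such pairwise transforms across every overlapping pair would place all units in a single room coordinate frame, which is the essence of self-localization from partial view overlap.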

“Our system enables new interactive projection mapping experiences that dynamically adapt content to any room,” Benko says. “Users can touch, shoot, stomp, dodge, and steer projected content that seamlessly coexists with their existing physical environment.”

Another augmented-reality paper being presented during UIST 2014, written by Benko, Microsoft researcher Andy Wilson and designer Federico Zannier, uses different technologies, this time to support face-to-face—“dyadic”—interaction with 3-D virtual objects. Dyadic Projected Spatial Augmented Reality combines dynamic projection mapping, multiple perspective views, and device-less interaction.

The main advantage of spatial augmented reality (SAR) over more traditional augmented-reality approaches, such as handheld devices with composited graphics or see-through, head-worn displays, is that users can interact with 3-D virtual objects and with each other without bulky equipment whose limited field of view hinders face-to-face interaction. Instead, SAR projects the augmenting graphics onto the physical object itself and avoids diverting users’ attention from the real world.

“That used to be the domain of theme parks or highly immersive theaters,” Benko says. “You needed a space that was designed for the experience. But now, with enough computing power, depth cameras, and projectors, it’s possible to create these immersive environments within an ordinary living space.

“Augmented reality fundamentally changes the nature of communication, with rich interactions not just for entertainment, but also for work and collaboration.”

Enabling More Natural, Nuanced Interactions

The paper that won a UIST 2014 best-paper award represents a huge joint effort. Written by team lead Ken Hinckley along with colleagues Benko, Michel Pahud, Pourang Irani, François Guimbretière, Marcel Gavriliu, Xiang ‘Anthony’ Chen, Fabrice Matulic, Bill Buxton, and Wilson, it explores grip and motion sensing with a tablet and a stylus.

“We’re at the point now where small mobile devices, such as pens, tablets, or phones, can be equipped with sensors to help us understand their use,” Benko says. “The way you grasp the tablet, whether you hold the stylus in a writing grip or tucked between your fingers—these all affect the position and movement of your gestures.”

The biomechanics behind each task are anything but simple, involving the interplay between two hands, each containing 27 bones, more than 30 muscles, nearly 50 nerves, and about 30 arteries.

The team’s goal was to capitalize on the hand’s dexterity to explore new frontiers in human-computer interaction using new sensing techniques.

“How can we interpret the signals correctly, more accurately?” Benko muses. “That’s the larger goal of the work: Instead of assuming explicit interaction, how can we enable a more natural and nuanced interface based on context of use? How can we impart new subtleties to interactions on mobile devices?”

Current device-interaction models tend to rely on having the user explicitly select the state of interaction. Models in which devices present context-based behaviors are few and still fairly simple: accelerometers, which flip screens between portrait and landscape modes, or Bluetooth, which makes automatic, appropriate connections to a vehicle, a home phone, or a computer. The researchers wanted to extend the interaction vocabulary and build on the existing range of gestures so that more nuanced context is possible.
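The accelerometer-driven orientation flip mentioned above can be made concrete with a minimal sketch. The axis convention here (an upright portrait device reads roughly (0, +g)) is an assumption for illustration; real platforms expose this behavior through their own sensor APIs:

```python
def screen_orientation(ax, ay):
    """Infer screen orientation from accelerometer readings.

    ax, ay: gravity components along the device's short (x) and long (y)
    axes while the device is roughly stationary. Assumed convention:
    upright portrait reads (ax, ay) ~= (0, +9.8). The point is that the
    device never asks the user; the context of use decides implicitly.
    """
    if abs(ay) >= abs(ax):
        return "portrait" if ay >= 0 else "portrait-upside-down"
    return "landscape-left" if ax >= 0 else "landscape-right"
```

Grip and motion sensing extends the same pattern: richer sensor signals feed richer implicit decisions, instead of requiring the user to explicitly select an interaction state.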

“It’s an attempt to create a set of gestures that are meaningful to a particular task and span the space of possibilities,” he says. “We know that not all the gestures are going to be successful. But it’s only by introducing them and obtaining feedback that this vocabulary will eventually be distilled into a few highly functional interactions that get widely adopted. This is still a long way off from the mainstream, but that is the exciting part for researchers—opening up new possibilities and working to make them useful.”

Fascinated by the Human Aspects of Computing

HCI first caught Benko’s interest during his undergraduate years.

“I was doing a lot of programming but found myself asking, ‘Where do people come into the equation?’” says Benko, a native of Croatia. “So for my graduate studies at Columbia University, I continued on with computing but focused on the human aspects. I was really lucky to work there with some of the pioneers in augmented reality.”

Currently, augmented reality is what captures his imagination the most. The notion that technology can augment our senses and alter how we comprehend reality was what initially inspired him to go into HCI research.

“It’s almost like you’re giving people superpowers,” he says, laughing. “I was fascinated with the idea that computing could be a tool that lets you do things you couldn’t do before, or do them faster, or change how you perceive reality. I was just really interested in the notion that computing should be about interacting with people.”

Flying High at Microsoft Research

Since joining Microsoft, Benko’s work has spanned many different areas, from augmented reality, computational illumination, surface computing, and new input form factors and devices to touch and freehand gestural input. When he arrived in the United States at age 16, though, Benko never imagined he would contribute to award-winning papers or collaborate with eminent scientists on papers and journal articles.

“I came to the U.S. through an exchange-student scholarship,” he says. “I attended a prep school, one of those places with old buildings straight out of Dead Poets Society. I had a really good time, my scholarship got extended for a second year, and I went on to university in the U.S., and then graduate school.”

Benko calls his 2005 internship at Microsoft an “eye-opener.” For one thing, he found himself in an office a few doors down the hall from researchers whose papers he had been reading and referencing.

“There were so many luminaries whose work I revered,” he recalls. “Not in my wildest dreams could I have imagined having lunch and chatting about my work with Andy Wilson or Ken Hinckley. They’re world-renowned experts in their fields.”

While sometimes working at Microsoft Research can feel like a day at the beach for Benko, this week won’t. With three papers and a demo to present, not to mention his responsibilities as conference chair, the beach, ironically, might have to be experienced virtually.
