woman facing a video screen using a Living Display for a video conference

Living Display

Visually natural teleconferencing

two men facing each other in a video conference using Living Display
a man and a woman facing each other in a video conference using Living Display

Videoconferencing has become an integral part of our lives and was a critical instrument for personal and professional interaction during the COVID-19 pandemic. Businesses, education, and interpersonal relationships have been transformed in recent years by the adoption of video communication, which continues to grow as society discovers the benefits of hybrid work. However, humans evolved over millennia to perceive the verbal and non-verbal social cues that are critical during in-person communication, and current videoconferencing technologies cannot fully replicate them. The purpose of this research is to provide an immersive, visually natural videoconferencing experience that comes much closer to an in-person, face-to-face meeting than existing conferencing products do. Our approach provides direct eye contact and true motion dynamics, with no requirement for a head-mounted display.

The Living Display system allows you to look directly at your counterpart on the monitor, and your counterpart to experience you looking directly at them, rather than at a camera above or below your display. In a typical video conference, when you change your position relative to the monitor, what you see stays the same. Our system goes beyond eye contact correction and provides the correct motion dynamics of the other person in 3D: as you move, what is shown on the monitor changes much as it would if you were looking through an actual physical window into another room. This video illustrates our vision for the Living Display system and how it can bring a better sense of presence to videoconferencing.

Technology

To capture fluid motion dynamics, two Azure Kinect sensors positioned above the display and one below provide color plus depth information (RGB-D) used to create a 3D model of you and your surroundings with Microsoft's Holoportation™ technology. The model is either rendered at the source and streamed as video, or the raw RGB-D data is transmitted across the network and rendered into a 3D model just before presentation to remote participants. You view remote participants the same way, providing a real-time, natural conferencing experience without the need for head-mounted displays.
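As a rough illustration of the second transport mode, the sketch below packs a raw RGB-D frame for transmission and reconstructs it on the receiving side. This is not the Holoportation implementation: the frame format and compression choice are assumptions, and synthetic data stands in for an Azure Kinect capture.

```python
import struct
import zlib
import numpy as np

def capture_rgbd(width=640, height=576):
    """Stand-in for one Azure Kinect capture: 8-bit color + 16-bit depth (mm)."""
    color = np.random.randint(0, 256, (height, width, 3), dtype=np.uint8)
    depth = np.random.randint(500, 4000, (height, width), dtype=np.uint16)
    return color, depth

def pack_rgbd(color, depth):
    """Losslessly compress a raw RGB-D frame for network transport;
    the receiver reconstructs the 3D model just before presentation."""
    c = zlib.compress(color.tobytes())
    d = zlib.compress(depth.tobytes())
    header = struct.pack("<IIII", color.shape[1], color.shape[0], len(c), len(d))
    return header + c + d

def unpack_rgbd(blob):
    """Receiver side: recover the color and depth images from the wire format."""
    w, h, nc, nd = struct.unpack_from("<IIII", blob, 0)
    off = struct.calcsize("<IIII")
    color = np.frombuffer(zlib.decompress(blob[off:off + nc]), np.uint8).reshape(h, w, 3)
    depth = np.frombuffer(zlib.decompress(blob[off + nc:off + nc + nd]), np.uint16).reshape(h, w)
    return color, depth

color, depth = capture_rgbd()
blob = pack_rgbd(color, depth)
c2, d2 = unpack_rgbd(blob)
assert np.array_equal(color, c2) and np.array_equal(depth, d2)
# Random stand-in data compresses poorly; real frames compress far better.
print(f"raw {(color.nbytes + depth.nbytes) / 1e6:.1f} MB -> {len(blob) / 1e6:.1f} MB per frame")
```

Sending raw RGB-D shifts rendering cost to the receiver but preserves the full 3D information, which is what enables the viewpoint-dependent rendering described next.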

The depth sensors around the local display also track your current head position, allowing the local view to be adjusted to your viewpoint and producing the correct parallax effect of looking through a window. Two-way data communication ensures that participants on both sides experience these motion dynamics in the video call.
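One standard way to realize that window-like parallax is an off-axis (asymmetric-frustum) projection recomputed each frame from the tracked head position. The sketch below shows the textbook construction for a screen centered at the origin; the actual projection used by Living Display is not described here, so the geometry and parameter values are assumptions.

```python
import numpy as np

def off_axis_projection(eye, screen_w, screen_h, near=0.05, far=10.0):
    """Asymmetric ("off-axis") frustum for a flat screen centered at the
    origin in the z = 0 plane, viewed from `eye` (meters, z > 0 toward
    the viewer). Returns an OpenGL-style 4x4 projection matrix."""
    ex, ey, ez = eye
    scale = near / ez                       # project screen edges onto the near plane
    l = (-screen_w / 2 - ex) * scale
    r = ( screen_w / 2 - ex) * scale
    b = (-screen_h / 2 - ey) * scale
    t = ( screen_h / 2 - ey) * scale
    return np.array([
        [2 * near / (r - l), 0.0,                (r + l) / (r - l),            0.0],
        [0.0,                2 * near / (t - b), (t + b) / (t - b),            0.0],
        [0.0,                0.0,               -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0,                0.0,               -1.0,                          0.0],
    ])

# Each frame: re-derive the projection from the tracked head position, then
# render the remote 3D model with it (the view matrix translates by -eye).
head = np.array([0.12, -0.03, 0.60])        # viewer ~60 cm away, slightly right and low
P = off_axis_projection(head, screen_w=0.60, screen_h=0.34)  # roughly a 27" monitor
```

Because the frustum is re-derived from the head position every frame, moving your head skews the rendered view exactly as a physical window would, which is the effect described above.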