Dictionary Learning for 3D Scene Representation


May 6, 2013


Ivana Tošic


Ricoh Innovations, Inc.


Recent developments in 3D technologies and depth sensing devices have posed new challenges in the processing of depth maps, which are crucial elements in 3D rendering and scene analysis. Typical image processing approaches to these challenges rely on transformations to appropriate representations (e.g., wavelets). However, because image and depth statistics differ, existing image representations might not generalize well enough to represent structures in depth maps efficiently.

One way to develop representations of data with unknown statistics is to learn them from a large database of examples. I will first present a new method for learning dictionaries of waveforms in which depth maps have sparse linear decompositions. The proposed method differs from existing approaches in that it is robust to the spatially varying noise typical of depth measurements. The effectiveness of this method will be demonstrated on the denoising of depth maps obtained from depth sensors. I will then introduce an algorithm for learning dictionaries that encompass two modalities of 3D scenes: image intensity and depth information. When trained on data from hybrid image-depth sensors, these representations converge to a set of related features, such as pairs of depth and intensity edges or image textures and depth slants. I will conclude by showing some depth inpainting examples based on these hybrid representations.

This is joint work with Bruno A. Olshausen, Benjamin J. Culpepper and Sarah Drewes.


Ivana Tošic

Ivana Tošic (ivana@rii.ricoh.com) received the Dipl.Ing. degree in telecommunications from the University of Niš, Serbia, and the Ph.D. degree in computer and communication sciences from the Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland. From 2009 to 2011, she was a postdoctoral researcher at the Redwood Center for Theoretical Neuroscience, University of California at Berkeley. She is currently an advisory research scientist at Ricoh Innovations, Inc., Menlo Park, California. Her research interests lie at the intersection of image processing and computational neuroscience and include binocular vision, 3-D scene representation, depth perception, and the representation and coding of the plenoptic function.