Iconic Scene Graphs and Iconic Summaries for Internet Photo Collections – (joint work with Rahul Raguram, Xiaowei Li, Changchang Wu, Christopher Zach, and Jan-Michael Frahm)


June 30, 2008


Svetlana Lazebnik


University of North Carolina at Chapel Hill


In this talk, I will present techniques for organizing two types of photo collections downloaded from Flickr.com: (1) famous landmark sites such as the Statue of Liberty and (2) general visual concepts such as “love” and “beauty.”

In the first part of the talk, I will present a system that integrates 2D appearance and 3D geometric constraints to efficiently build 3D models of landmarks, extract scene summaries, and recognize the landmark in new test images. The system starts by clustering images using low-dimensional global “gist” descriptors and then performs geometric verification to retain only the clusters whose images share a common 3D structure. Each valid cluster is represented by a single iconic view, and geometric relationships between iconic views are captured in an iconic scene graph. In addition to serving as a compact scene summary, this graph is used to guide structure from motion to efficiently produce 3D models covering most of the scene content. The set of iconic images can also be used for recognition, i.e., determining whether new test images contain the landmark.
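
As a rough illustration of the cluster-verification and graph-building steps described above, here is a minimal Python sketch. It is not the system presented in the talk. It assumes the gist clustering has already been done, replaces real geometric verification with a toy table of pairwise inlier counts (`PAIR_INLIERS`), and uses hypothetical thresholds (`min_inliers`, `min_support`) and a simple "most total inliers" rule for picking the iconic view.

```python
from itertools import combinations

# Hypothetical inlier counts from pairwise geometric verification (toy data).
PAIR_INLIERS = {
    ("a1", "a2"): 40, ("a1", "a3"): 35, ("a2", "a3"): 10,
    ("b1", "b2"): 25, ("b2", "b3"): 30, ("b1", "b3"): 22,
    ("c1", "c2"): 3,
    ("a1", "b2"): 50,
}


def inliers(a, b):
    """Symmetric lookup; unlisted pairs are treated as having zero inliers."""
    return PAIR_INLIERS.get((a, b), PAIR_INLIERS.get((b, a), 0))


def select_iconic(cluster, min_inliers=20, min_support=2):
    """Return the image with the most verified inliers to the rest of its
    cluster, or None if too few cluster members verify against it."""
    best, best_score, best_support = None, -1, 0
    for cand in cluster:
        counts = [inliers(cand, other) for other in cluster if other != cand]
        verified = [c for c in counts if c >= min_inliers]
        if sum(verified) > best_score:
            best, best_score, best_support = cand, sum(verified), len(verified)
    return best if best_support >= min_support else None


def build_iconic_graph(iconics, min_inliers=20):
    """Connect two iconic views when they pass the same verification test."""
    return {(a, b) for a, b in combinations(iconics, 2)
            if inliers(a, b) >= min_inliers}


# Assumed output of the gist clustering stage: three candidate clusters.
clusters = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2"]]
candidates = [select_iconic(c) for c in clusters]
iconics = [img for img in candidates if img is not None]
edges = build_iconic_graph(iconics)
```

In this toy data the third cluster fails verification and is discarded, which mirrors the idea of retaining only clusters whose images share a common 3D structure.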

In the second part of the talk, I will discuss the problem of computing iconic summaries for more general visual concepts that are not characterized by rigid 3D geometry. In this case, it is more appropriate to define iconic images as representatives of subsets of the collection consistent in terms of global 2D appearance and semantics. Such subsets are found by jointly clustering images using “gist” descriptors and Flickr tags. For each joint cluster, a representative iconic image is selected using an automatic quality-based ranking scheme. To visualize the resulting summary, iconic images are grouped according to their semantic “theme” (tag-based cluster) and multidimensional scaling is used to compute a 2D layout reflecting the relationships between the themes.
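
The 2D layout step can be illustrated with classical (Torgerson) multidimensional scaling. The sketch below is one standard variant of MDS, not necessarily the one used in this work. The theme-to-theme distance matrix `theme_dist` is made-up example data, and how such distances would be derived from tag-based clusters is left open here.

```python
import numpy as np


def classical_mds(dist, dims=2):
    """Embed points in `dims` dimensions from a symmetric distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest `dims`.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    # Clip tiny negative eigenvalues caused by rounding before the sqrt.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Hypothetical distances between four themes (corners of a unit square).
s = np.sqrt(2.0)
theme_dist = np.array([
    [0.0, 1.0, 1.0, s],
    [1.0, 0.0, s, 1.0],
    [1.0, s, 0.0, 1.0],
    [s, 1.0, 1.0, 0.0],
])
layout = classical_mds(theme_dist)  # one (x, y) row per theme

# Pairwise distances in the layout, for comparison with the input.
recovered = np.linalg.norm(layout[:, None, :] - layout[None, :, :], axis=-1)
```

Because the example distances are exactly realizable in the plane, the layout reproduces them up to rotation and reflection. Real theme distances would generally be reproduced only approximately.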


Svetlana Lazebnik

Svetlana Lazebnik received the B.S. degree in computer science from DePaul University in Chicago, IL, in 2000 and the M.S. degree in computer science from the University of Illinois at Urbana-Champaign (UIUC) in 2002. She completed her Ph.D. thesis, entitled “Local, Semi-Local and Global Models for Texture, Object and Scene Recognition,” at UIUC in the spring of 2006. This work was co-supervised by Prof. Jean Ponce at UIUC and Dr. Cordelia Schmid at INRIA Rhone-Alpes. In July of 2007, Svetlana joined the Computer Science department at the University of North Carolina at Chapel Hill as an assistant professor. Her research interests include computer vision, object and scene recognition, and machine learning.