Visual Computing

Established: July 3, 2010

Overview

Computer vision is an exciting research area that studies how to make computers efficiently perceive, process, and understand visual data such as images and videos. The ultimate goal is for computers to emulate the striking perceptual capability of human eyes and brains, or even to surpass and assist humans in certain ways. The Visual Computing Group at Microsoft Research Asia consists of an elite team of researchers and engineers whose expertise spans the entire spectrum of research topics in computer vision, from mathematical theory to practical applications, from physical systems to software development, and from low-level image processing to high-level image understanding. Research results from our group have had a fundamental impact on many important applications such as New High-Resolution Cameras, Face Recognition, Image Search, Virtual Earth, and Graphics & Games.

More specifically, our research activities are centered around several main research thrusts:

  1. Imaging and Photogrammetry, including high-resolution cameras, radiometric calibration, photometric stereo, 3D imaging and video, and image and video enhancement.
  2. Pattern Recognition and Statistical Learning, including data clustering and classification, manifold learning, and high-dimensional geometry and statistics.
  3. Object Detection and Recognition, including face detection, alignment, and tagging, video-based face recognition, and sparsity-based robust face recognition.
  4. Dynamical Vision, including object tracking, video motion analysis and editing, video summarization, video motion and object segmentation, and dynamical photometric stereo.
  5. Interactive and Internet Vision, including interactive image segmentation, completion, and normal reconstruction; image search and re-ranking; large-scale image and object retrieval; and visualization of large image collections.

Group News & Activities

  • Tutorial on Robust PCA and its Applications by John, Zhouchen, and Yi at the International Conference on Image Processing, Hong Kong, September 2010.
  • Tutorial on Photometric Stereo by Yasuyuki, Bennett, and Moshe at the International Conference on Image Processing, Hong Kong, September 2010.
  • The group manager Yi Ma was interviewed by CNN about an article “Why Face Recognition isn’t Scary — yet.” July 9, 2010
  • Our group had 15 papers published at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010.
  • The group manager Yi Ma gave the plenary speech at the International Conference on Visual Communications and Image Processing, July 2010.
  • Microsoft Research featured story “Yi Ma and the Blessing of Dimensionality.” May 28, 2010.
  • A new giga-pixel digital camera has been developed by our researcher Moshe Ben-Ezra.

Highlighted Projects

A Glimpse at Several Representative Projects:

  • A Giga-Pixel Digital Camera (by Dr. Moshe Ben-Ezra): This camera is a state-of-the-art, commercially affordable (less than $25K) solution for high-quality, high-resolution imaging. It produces images at a resolution of 1.6 gigapixels and has been used to digitize artworks and antiques in unprecedented detail. For example, combined with photometric stereo, images captured by this camera can recover striking 3D surface details of oil paintings and hence help reveal the artist’s skills and style. The camera has broad applications in cultural heritage, archaeology, and art preservation and insurance.
  • Robust Processing and Analysis of High-Dimensional Data (John Wright, Zhouchen Lin, Yi Ma): The need to detect and correct gross errors and outliers arises throughout computational data analysis. For example, in many computer vision problems, erroneous measurements arise from occlusion, tracker failure, or violations of an assumed model (e.g., specularities in face recognition or photometric stereo). Correctly handling such non-ideal observations is essential to building systems that work under real-world conditions. We are working to meet this need with new algorithmic tools based on convex optimization. These algorithms are scalable and efficient, and come with sharp performance guarantees based on concentration of measure in high-dimensional spaces. These tools have had a major impact on important problems such as highly robust face recognition and robust principal component analysis.
  • Photo Album Management – Face Tagging (Fang Wen, Jian Sun): More and more people take huge numbers of photos in their daily lives. The ultimate goal of our photo album management work is to help users manage, search, share, and have fun with these photos easily. ‘Who is in the photo’ is a good clue for organizing and sharing photos, but tagging people’s names is a tedious job for the user. Our Face Tagging work combines state-of-the-art face recognition and clustering technologies with a friendly user interface to make tagging effortless and fun.
  • Video Analysis and Synthesis (Yichen Wei, Yasuyuki Matsushita): We work on the analysis, browsing, and automated synthesis of videos. Video is an important medium that is becoming more and more popular with the increasing availability of video cameras. Many previous approaches are natural extensions of image analysis and synthesis and often disregard the dynamics in the video. We are interested in these dynamics and use them for further analysis and applications. Past work includes video stabilization, video completion, video object tracking, and global motion analysis. We are developing new technologies that capitalize on the visual dynamics in videos.
  • Interactive Image Segmentation and Cut-Out (Jian Sun): Efficient, interactive foreground/background segmentation in still images is of great practical importance in image editing. As outputs of this research, we have developed a scribble-based tool (Lazy Snapping) and a painting-based tool (Paint Selection). With these tools, the user can select an object or region of interest with minimal effort, for applications ranging from object cut-and-paste to local color/tone adjustment.
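
To give a concrete flavor of the convex-optimization tools mentioned in the Robust Processing project above, here is a minimal sketch of robust principal component analysis posed as principal component pursuit: split a data matrix M into a low-rank part L and a sparse error part S by minimizing the nuclear norm of L plus a weighted l1 norm of S, subject to M = L + S. This is an illustration under stated assumptions, not the group's released code. The function names (shrink, svd_threshold, robust_pca), the fixed penalty mu, the default weight lam = 1/sqrt(max(m, n)), and the stopping rule are common choices from the literature and are assumptions here.

```python
import numpy as np


def shrink(X, tau):
    """Soft-thresholding: the proximal operator of tau * (l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Singular value thresholding: proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def robust_pca(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into low-rank L and sparse S with an ADMM-style iteration.

    Assumes M is a nonzero real 2-D array. The defaults for lam and mu
    are conventional heuristics, not values taken from this page.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint M = L + S
    norm_M = np.linalg.norm(M)

    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)  # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)         # sparse update
        residual = M - L - S
        Y = Y + mu * residual                        # dual ascent step
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

In a vision setting, each column of M might hold one vectorized image of the same scene or face; L would then capture the shared structure, while S would absorb gross errors such as occlusions or specularities.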

People

Posts

Office Lens Is a Snap

The moment mobile-phone manufacturers added cameras to their devices, they stopped being just mobile phones. Not only have lightweight phone cameras made casual photography easy and spontaneous, they also have changed the way we record our lives. Now, with help from Microsoft Research, the Office team is out to change how we document our lives in another way—with the Office Lens app for Windows Phone 8. Office Lens, now available in the Windows Phone Store,…

March 2014

Microsoft Research Blog

Helping Kinect Recognize Faces

By Douglas Gantenbein, Senior Writer, Microsoft News Center

To use a Kinect for Xbox 360 gaming device is to see something akin to magic. Different people move in and out of its view, and Kinect recognizes the change in a player and responds accordingly. It accomplishes this task despite the enormous variation in what it sees. Lighting can change within a room. A player might appear close to the Kinect one minute, farther away the…

October 2011

Microsoft Research Blog

Beijing Lab’s New Initiative: eHeritage

By Rob Knies, Managing Editor, Microsoft Research

Leonardo da Vinci and Filippo Brunelleschi resound through history as two of the guiding lights of the Italian Renaissance. Leonardo, of course, gifted us with the Mona Lisa and The Last Supper, but he also excelled at mathematics, engineering, anatomy, botany, and a clutch of additional artistic endeavors. Brunelleschi, meanwhile, is heralded for his dome atop the Duomo in Florence, but in addition to that engineering feat, he…

April 2009

Microsoft Research Blog

Microsoft Research Asia’s Guo Discusses the Future of Graphics

By Rob Knies, Managing Editor, Microsoft Research

Microsoft Research Asia, celebrating its 10th anniversary, recently presented its annual Computing in the 21st Century academic symposium, held over two days, Nov. 4 in Beijing and Nov. 7 in Singapore. For the second consecutive year, Baining Guo, assistant managing director of the lab, served as chair of the event. “We’re very excited,” Guo said of the anniversary festivities. “It’s a big celebration.” The morning after the Singapore…

December 2008

Microsoft Research Blog

A Few Free Downloadables: