Microsoft Research Blog

Group Shot: Getting Everyone to Smile

June 6, 2006 | By Microsoft blog editor

By Rob Knies, Managing Editor, Microsoft Research

Beauty, they say, is in the eye of the beholder. The subtext of that hoary proverb, of course, is that sometimes the beholder’s eye doesn’t see things quite the same way that the rest of us do. The mental image—what appears in the mind’s eye—is what lingers, and that image, personalized and unique, is susceptible to many influences.

Images produced by a camera—unencumbered by the unrivaled variety of filters inherent in the human brain—are more cut and dried. That picture of your boyfriend with his eyes closed? Toss it. The family photo in which Grandpa is looking off-camera while everybody else is beaming? Close, but …

A good photographer can compensate for such commonplace occurrences. As for the rest of us? Well, we’ve got Group Shot.

A product of Microsoft Research’s Redmond lab, Group Shot was developed by Alex Colburn, Matt Uyttendaele, and Michael Cohen of the Interactive Visual Media group over a period of six weeks late in 2005. Group Shot has the express purpose of helping a user to improve a flawed group photo to the desired state envisioned when the picture was taken.

“Photographs are instants of time,” Colburn smiles, “but we don’t necessarily remember an instant in time. We’re actually remembering a moment, and our brains backfill in the details. For example, you don’t remember your family portrait as a moment when everyone has their eyes closed and their mouths open. You remember a moment when everyone is smiling and looks good. With a camera, it is hard to capture those perfect photos, because those moments might not have existed, or you may have just missed them.”

“What we want to do is to use multiple photos to help reconstruct a moment.”

Group Shot, available for download, makes it easy for a user to take part of one photo and use it to replace a similar but flawed part of another, creating a composite that is better than either original and closer to what the photographer saw in the mind’s eye.

The inspiration for Group Shot extends back a few years, to a 2003 paper called Image Stacks, written by Cohen, Colburn, and Steven Drucker, all then with Microsoft Research. In the introduction to that paper, the authors wrote:

“Taking group photographs can be frustrating, because capturing a single image in which everyone is smiling and has their eyes open is nearly impossible. Most photographers take a series of photographs, hoping to capture at least one satisfactory image of the group. However, this approach may never yield such an image. On the other hand, within the series of images, it is likely that at least one good image of each individual within the group will be captured. A group photograph could be created by combining the best portions of each individual image into a single composite image.”

The idea of manipulating an image using assets from a number of similar images is the key, but to do that, you need to have those similar images.

With Group Shot, Colburn says: “I wanted to expose the world to this idea of taking a lot of photos. Once you have a stack or series of images, you can start doing all sorts of interesting things with them. Take a lot of photos; don’t just take one!”

Another paper, Interactive Digital Photomontage, was written in 2004 by Aseem Agarwala and Mira Dontcheva of the University of Washington; Maneesh Agrawala of the University of California, Berkeley; Drucker and Colburn of Microsoft Research; Brian Curless of the University of Washington; and David Salesin and Cohen, then both with Microsoft Research. It addresses many of the technical challenges of piecing together parts of images: choosing good seams between parts of the various images and removing unwanted elements resulting from the joined images.

The authors of the second paper utilized a technique called graph-cut optimization to find the best possible seams along which to cut source images. That, combined with image-alignment technology and a desire to make a usable tool, led to Group Shot. This image-alignment technology, also developed by the Interactive Visual Media group, has been incorporated into the panoramic stitching feature of Microsoft’s Expression Graphic Designer.
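To make the idea of a seam concrete, here is a minimal Python sketch. It does not implement graph-cut optimization; it swaps in a much simpler dynamic-programming search for a single top-to-bottom seam between two grayscale images that are assumed to be already aligned and are given as lists of rows. The function names best_vertical_seam and composite, and the absolute-difference cost, are assumptions chosen for illustration and are not taken from Group Shot or the paper.

```python
def best_vertical_seam(left, right):
    """Return, for each row, the column where the composite switches
    from `left` to `right`, chosen so the two images disagree as little
    as possible along the cut. (Illustrative stand-in for graph cut.)"""
    rows, cols = len(left), len(left[0])
    # Cost of cutting through a pixel: how much the two images differ there.
    diff = [[abs(left[r][c] - right[r][c]) for c in range(cols)]
            for r in range(rows)]
    # cost[r][c] = cheapest total cost of any seam that ends at (r, c).
    cost = [diff[0][:]]
    for r in range(1, rows):
        row = []
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols, c + 2)
            row.append(diff[r][c] + min(cost[r - 1][lo:hi]))
        cost.append(row)
    # Start at the cheapest bottom pixel and walk back up, moving at most
    # one column left or right per row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(0, c - 1), min(cols, c + 2)
        c = min(range(lo, hi), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def composite(left, right, seam):
    """Take pixels left of the seam from `left`, the rest from `right`."""
    return [left[r][:seam[r]] + right[r][seam[r]:] for r in range(len(left))]
```

Calling best_vertical_seam on two aligned shots and passing the result to composite yields an image whose join runs through the pixels where the two sources already look alike, which is why a well-chosen seam is hard to see. A graph-cut formulation generalizes this to seams of arbitrary shape and to more than two source images.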

Shooting the photographs in a brief span of time helps eliminate the sort of motion that can hamper Group Shot’s effectiveness.

“If everybody’s moving around a lot,” Colburn says, “it might not be possible to get a combination where you get a good shot. If you can take four photos where everyone is trying to stand still and smile, you’re bound to get a good shot of everybody at least once.”

Such front-end care can lead to quick and easy success with Group Shot.

“If you’ve ever tried to copy a face from one image to another using photo-editing software, you know that it can be difficult,” Colburn says. “You have to copy and paste each section and use a smearing brush to smooth out the edges. It’s time-consuming and challenging.”

“With Group Shot, you could do this in minutes, versus hours, and without a whole lot of domain knowledge. People who are good at using Photoshop can do these things pretty quickly, but for the rest of us, it’s hard. Group Shot is fast, and it’s easy.”

There are many other things Group Shot either can do now or will be able to do as the application is extended. If it enables you to improve on a photo in which Aunt Mamie’s mouth is agape, it also could be of use in other scenarios in which photo reality just doesn’t measure up to the persistence of memory.

  • Removing unwanted images from photo backgrounds: “You can work in positive space,” Colburn says, “where you’re building a composite of things you like, or you can work in negative space, where you’re taking out things you don’t like.”
  • Artistic shots with special lighting: “If you wanted to be a little more artistic about how you choose your lighting, you could create lighting situations that you could never encounter for real. There are a lot of somewhat more artistic things that you can do—using this to paint with light.”
  • Extended depth of field: creating an image with an extended focal range from a series of images focused at different depths. This technique can be particularly useful for macro photography.
  • Aesthetically pleasing images that could never occur in reality: “You can get things that don’t exist,” Colburn says. “You can take a picture of a building against a sunset, but you want to have the sky looking blue—however you want to compose these things that might not exist in reality but might be how you remember them.”
  • Museum-type artifacts lit unrealistically from all directions in order to best display their details: “A good photographer can usually do a lot of these things that the amateur photographer can’t,” Colburn says, “but it might be a really cheap and easy way to reduce the cost of getting good photos or displays for the Web.”
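The extended-depth-of-field item in the list above can be sketched in a few lines of Python. This is a hedged illustration, not Group Shot's method: it assumes the images in the stack are aligned grayscale images given as lists of rows, and it uses a simple local-contrast measure as a stand-in for focus. The names sharpness and focus_stack are hypothetical.

```python
def sharpness(img, r, c):
    """Local contrast at (r, c): sum of absolute differences to the
    4-connected neighbours. In-focus regions tend to score higher."""
    rows, cols = len(img), len(img[0])
    total = 0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            total += abs(img[r][c] - img[rr][cc])
    return total


def focus_stack(stack):
    """For every pixel, copy the value from whichever image in `stack`
    is locally sharpest at that position."""
    rows, cols = len(stack[0]), len(stack[0][0])
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            best = max(stack, key=lambda img: sharpness(img, r, c))
            row.append(best[r][c])
        result.append(row)
    return result
```

Given one shot focused on the foreground and another on the background, focus_stack keeps the detailed region from each. Choosing pixels independently like this can leave visible speckle at the boundaries, which is where seam selection of the kind described earlier becomes useful.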

It says something about Group Shot’s intended audience that Colburn, who put the technology together, is hardly a professional shutterbug.

“I’m a hobbyist photographer,” he admits. “I work with images and image-based data representations in my research, but I don’t have any professional photography background. It’s something that I do for fun. I take pictures of my family and dogs.”

And that’s what Group Shot is all about: giving the average, digital-camera-equipped person images worthy of pride.

“It lets me fix my photos,” Colburn says. “I can send a photo to my mom that otherwise would have been just a lousy shot that wasn’t worth saving. That’s the coolest thing for me in Group Shot: It actually works, and I use it.”
