Cliplets: Juxtaposing Still and Dynamic Imagery

Neel Joshi, Sisil Mehta, Steven Drucker, Eric Stollnitz, Hugues Hoppe, Matt Uyttendaele, Michael Cohen

MSR-TR-2012-52

We explore creating “cliplets”, a form of visual media that juxtaposes still image and video segments, both spatially and temporally, to expressively abstract a moment. Much as in “cinemagraphs”, the tension between static and dynamic elements in a cliplet reinforces both aspects, strongly focusing the viewer’s attention. Creating this type of imagery is challenging without professional tools and training. We develop a set of idioms, essentially spatiotemporal mappings, that characterize cliplet elements, and use these idioms in an interactive system to quickly compose a cliplet from ordinary handheld video. One difficulty is to avoid artifacts in the cliplet composition without resorting to extensive manual input. We address this with automatic alignment, looping optimization and feathering, simultaneous matting and compositing, and Laplacian blending. A key user-interface challenge is to provide affordances to define the parameters of the mappings from input time to output time while maintaining a focus on the cliplet being created. We demonstrate the creation of a variety of cliplet types. We also report on informal feedback as well as a more structured survey of users.
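To make the idea of an idiom as a mapping from output time to input time more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the idiom labels (still, play, loop, mirror), the function names, and the integer frame-index convention are all assumptions chosen for illustration. Each function takes an output frame index and returns the input frame that would be shown for that layer.

```python
# Illustrative time mappings only; names and conventions are assumptions,
# not the system described in the report.

def still(t, frame):
    """Always show the same input frame, regardless of output time t."""
    return frame

def play(t, start, end):
    """Play input frames start..end once, then hold the last frame."""
    return min(start + t, end)

def loop(t, start, end):
    """Repeat input frames start..end forever, jumping back to start."""
    period = end - start + 1
    return start + t % period

def mirror(t, start, end):
    """Play start..end forward, then backward, and repeat (ping-pong)."""
    span = end - start
    if span == 0:
        return start
    phase = t % (2 * span)
    return start + (phase if phase <= span else 2 * span - phase)

def render(mapping, n_frames, **params):
    """List the input frame chosen for each of n_frames output frames."""
    return [mapping(t, **params) for t in range(n_frames)]

if __name__ == "__main__":
    print(render(still, 4, frame=5))
    print(render(play, 5, start=10, end=12))
    print(render(loop, 7, start=10, end=12))
    print(render(mirror, 8, start=10, end=13))
```

In a full system, each such mapping would apply only within a spatial region of the frame, and the layers would then be composited; the sketch covers only the temporal part.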

A still photograph is a limited format for capturing moments that span an interval of time. Video is the traditional method for recording durations of time, but the subjective “moment” that one desires to capture is often lost in the chaos of shaky camerawork, irrelevant background clutter, and noise that dominate most casually recorded video clips. This work provides a creative lens for focusing on the important aspects of a moment by performing spatiotemporal compositing and editing on video-clip input. It takes the form of an interactive app that uses semi-automated methods to let users create “cliplets”, a type of imagery that sits between stills and video, from handheld videos.