We propose a framework for extracting structure from stereo that represents the scene as a collection of approximately planar layers. Each layer consists of an explicit 3D plane equation, a colored image with per-pixel opacity, and a per-pixel depth offset relative to the plane. Initial estimates of the layers are recovered using techniques taken from parametric motion estimation. These initial estimates are then refined using a re-synthesis algorithm that takes into account both occlusions and mixed pixels. Reasoning about such effects allows depth and color information to be recovered with high accuracy, even in partially occluded regions. Another key benefit is that the output consists of a collection of approximately planar regions, a representation better suited than a dense depth map to applications such as rendering and video parsing.
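To make the layer representation concrete, the following is a minimal sketch of one layer as a data structure, together with a back-to-front blend of several layers. The names `Layer` and `composite_over` are hypothetical and do not come from the paper. The sketch also assumes that the layers have already been warped onto a common pixel grid, that colors are not premultiplied by opacity, and that layers are blended with the standard "over" operator; the full re-synthesis algorithm involves more than this.

```python
from dataclasses import dataclass
from typing import List, Tuple

Color = Tuple[float, float, float]


@dataclass
class Layer:
    """One approximately planar layer (field names are illustrative)."""
    plane: Tuple[float, float, float, float]  # (a, b, c, d) with a*X + b*Y + c*Z + d = 0
    color: List[List[Color]]    # per-pixel RGB, each channel in [0, 1]
    alpha: List[List[float]]    # per-pixel opacity in [0, 1]
    offset: List[List[float]]   # per-pixel depth offset relative to the plane


def composite_over(layers: List[Layer]) -> List[List[Color]]:
    """Blend layers given in back-to-front order with the 'over' operator.

    Assumes every layer is already aligned to the same pixel grid.
    """
    height = len(layers[0].alpha)
    width = len(layers[0].alpha[0])
    # Start from a black background.
    out = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                a = layer.alpha[y][x]
                src = layer.color[y][x]
                dst = out[y][x]
                # A fractional alpha models a mixed pixel: part of this layer,
                # part of whatever lies behind it.
                out[y][x] = tuple(a * s + (1.0 - a) * d for s, d in zip(src, dst))
    return out
```

In this sketch a pixel with fractional opacity mixes the color of its own layer with the colors of the layers behind it, which is one simple way to model mixed pixels at layer boundaries. The `plane` and `offset` fields are carried only to mirror the representation described above; the blend itself does not use them.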