I will introduce a novel feed-forward architecture for semantic segmentation. We map small image elements (superpixels) to rich feature representations extracted from a sequence of nested regions of increasing extent, obtained by “zooming out” from the superpixel all the way to scene-level resolution. Our approach exploits statistical structure in the image and in the label space without setting up explicit structured prediction mechanisms, and thus avoids complex and expensive inference. Instead, superpixels are classified by a feedforward multilayer network with skip-layer connections spanning the zoom-out levels. Using an off-the-shelf network pre-trained on the ImageNet classification task, this zoom-out architecture achieves 69.6% average accuracy on the PASCAL VOC 2012 test set, near the current state of the art. Joint work with Mohammadreza Mostajabi and Payman Yadollahpour.
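To make the zoom-out idea concrete, here is a minimal sketch of extracting features for one superpixel from a sequence of nested regions of increasing extent, ending at the full scene. This is an illustrative toy, not the actual method: the function name `zoomout_features`, the zoom factors, and the use of mean color as a region descriptor are all my own stand-ins (the real system pools CNN activations at each level).

```python
import numpy as np

def zoomout_features(image, sp_mask, zoom_factors=(1, 2, 4)):
    """Toy zoom-out descriptor for one superpixel (hypothetical sketch).

    Concatenates a per-region descriptor from nested boxes around the
    superpixel, plus one scene-level descriptor.  Here the descriptor is
    just the mean color of the region; in the actual architecture each
    level would contribute pooled CNN features instead.
    """
    H, W, C = image.shape
    ys, xs = np.nonzero(sp_mask)
    cy, cx = int(ys.mean()), int(xs.mean())          # superpixel centroid
    h = max(int(ys.max() - ys.min()) + 1, 1)
    w = max(int(xs.max() - xs.min()) + 1, 1)

    feats = []
    for z in zoom_factors:                           # progressively larger boxes
        hh, ww = min(h * z, H), min(w * z, W)
        y0 = int(np.clip(cy - hh // 2, 0, H - hh))
        x0 = int(np.clip(cx - ww // 2, 0, W - ww))
        region = image[y0:y0 + hh, x0:x0 + ww]
        feats.append(region.reshape(-1, C).mean(axis=0))
    feats.append(image.reshape(-1, C).mean(axis=0))  # scene-level region
    return np.concatenate(feats)
```

The resulting vector (one descriptor per zoom-out level, concatenated) is what a feedforward multilayer network would then classify into one of the semantic categories, independently for each superpixel.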