We introduce an approach to accurately detect and segment partially occluded objects across various viewpoints and scales. Our main contribution is a novel framework that combines object-level descriptions (such as position, shape, and color) with pixel-level appearance, boundary, and occlusion reasoning. During training, we exploit a rough 3D object model to learn physically localized part appearances. To find and segment objects in an image, we generate proposals based on the appearance and layout of local parts. The proposals are then refined by incorporating object-level information, and overlapping objects compete for pixels to produce a final description and segmentation of the objects in the scene. A further contribution is a novel instance penalty, which is handled very efficiently during inference. We experimentally validate our approach on the challenging PASCAL’06 car database.
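The pixel competition and instance penalty described above can be illustrated with a minimal sketch (not the paper's actual inference procedure): each object hypothesis contributes a per-pixel score map, every pixel is assigned to the highest-scoring hypothesis (or background), and a hypothesis is discarded when its total evidence does not outweigh a fixed per-instance cost. All function names, score values, and the penalty constant here are illustrative assumptions.

```python
import numpy as np

def compete_for_pixels(score_maps, bg_score=0.0, instance_penalty=1.5):
    """Greedy sketch of pixel competition with an instance penalty.

    score_maps: list of HxW arrays, one per object hypothesis
    (all names and values are illustrative, not the paper's method).
    """
    h, w = score_maps[0].shape
    # channel 0 is background; channel i is hypothesis i
    stacked = np.stack([np.full((h, w), bg_score)] + list(score_maps))
    labels = stacked.argmax(axis=0)
    for i in range(1, stacked.shape[0]):
        mask = labels == i
        # drop the instance if its net evidence does not pay its penalty
        if (stacked[i][mask] - bg_score).sum() < instance_penalty:
            labels[mask] = 0
    return labels

# two overlapping hypotheses on a toy 4x4 image
a = np.zeros((4, 4)); a[:2, :] = 2.0   # stronger object covering the top rows
b = np.zeros((4, 4)); b[1:, :] = 1.0   # weaker object overlapping row 1
seg = compete_for_pixels([a, b])
```

In the overlap (row 1) the stronger hypothesis wins every pixel, so the final segmentation gives rows 0-1 to the first object and rows 2-3 to the second; a hypothesis left with too little evidence after competition would be suppressed entirely by the penalty.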