We present a fully automatic method for obtaining an approximate calibration of a camera (its alignment to a world frame and its focal length) from a single image of an unknown scene, provided only that the scene satisfies the Manhattan world assumption. This assumption states that the imaged scene contains three dominant, mutually orthogonal directions; it is often satisfied by indoor and outdoor views of man-made structures and environments.
The proposed method combines the calibration likelihood introduced in ‘Manhattan world’ (J.M. Coughlan and A.L. Yuille) with a stochastic search algorithm to obtain a maximum a posteriori (MAP) estimate of the camera’s focal length and alignment. Results on real images of indoor scenes are presented. The calibrations obtained are less accurate than those from standard methods that employ a calibration pattern or multiple images. However, they are accurate enough for common vision tasks such as tracking. Moreover, the results are obtained without any user intervention, from a single image, and without the use of a calibration pattern.
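
To make the two calibration quantities concrete, the following minimal Python sketch shows how a focal length and a camera alignment together determine the vanishing points of the three Manhattan directions, and how the focal length can be read back from two of them. It is an illustration only, not the likelihood or the stochastic search used by the method. It assumes a pinhole camera with square pixels and the principal point at the image origin, neither of which is stated above, and the function names and the angle parameterisation are hypothetical.

```python
import numpy as np

def rotation_from_angles(pan, tilt, roll):
    """Rotation matrix from three angles in radians (an assumed parameterisation)."""
    cy, sy = np.cos(pan), np.sin(pan)
    cx, sx = np.cos(tilt), np.sin(tilt)
    cz, sz = np.cos(roll), np.sin(roll)
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Rx @ Ry

def vanishing_points(f, R):
    """Homogeneous vanishing points of the three world axes, one per column of K @ R."""
    K = np.diag([f, f, 1.0])  # assumed intrinsics: square pixels, principal point at origin
    return K @ R

def focal_from_two_vps(v1, v2):
    """Focal length from two finite vanishing points of orthogonal directions.

    With the principal point at the origin, orthogonality of the two
    directions implies p1 . p2 = -f^2 for the inhomogeneous points p1, p2.
    """
    p1 = v1[:2] / v1[2]
    p2 = v2[:2] / v2[2]
    f_squared = -np.dot(p1, p2)
    if f_squared <= 0:
        raise ValueError("vanishing points are inconsistent with orthogonal directions")
    return np.sqrt(f_squared)

# Usage: synthesise vanishing points from a known calibration, then recover f.
R = rotation_from_angles(pan=0.3, tilt=0.2, roll=0.1)
V = vanishing_points(800.0, R)
f_recovered = focal_from_two_vps(V[:, 0], V[:, 1])
print(f_recovered)
```

The recovery step fails when either vanishing point lies at infinity (third homogeneous coordinate zero), which is one reason a search over the full calibration can be preferable to a closed-form computation from detected vanishing points.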