Most existing structure from motion (SFM) approaches for unordered images cannot handle multiple instances of the same structure in the scene. When image pairs containing different instances are matched based on visual similarity, the pairwise geometric relations as well as the correspondences inferred from such pairs are erroneous, which can lead to catastrophic failures in the reconstruction.
In this paper, we investigate the geometric ambiguities caused by the presence of repeated or duplicate structures and show that disambiguating between multiple hypotheses requires more than pure geometric reasoning. We couple an expectation-maximization (EM) algorithm, which estimates camera poses and identifies false match-pairs, with an efficient sampling method that discovers plausible data association hypotheses. The sampling method is informed by both geometric and image-based cues. Our algorithm usually recovers the correct data association, even in the presence of large numbers of false pairwise matches.