A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. “Exemplars” are selected representatives of raw training data, used here to represent probabilistic mixture distributions of object configurations. Their use avoids tedious hand-construction of object models, and problems with changes of topology. Using exemplars in place of a parameterized model poses several challenges, addressed here with what we call the “Metric Mixture” (M2) approach. The M2 model has several valuable properties. Principally, it provides alternatives to standard learning algorithms by allowing the use of metrics that are not embedded in a vector space. Secondly, it uses a noise model that is learned from training data. Lastly, it eliminates any need for an assumption of probabilistic pixelwise independence. Experiments demonstrate the effectiveness of the M2 model in two domains: tracking walking people using chamfer distances on binary edge images, and tracking mouth movements by means of a shuffle distance.