Interactive surfaces and related tangible user interfaces often involve everyday objects that are identified, tracked, and augmented with digital information. Traditional approaches for recognizing these objects typically rely on complex pattern recognition techniques, or the addition of active electronics or fiducials that alter the visual qualities of those objects, making them less practical for real-world use. Radio Frequency Identification (RFID) technology provides an unobtrusive method of sensing the presence of and identifying tagged nearby objects but has no inherent means of determining the position of tagged objects. Computer vision, on the other hand, is an established approach to track objects with a camera. While shapes and movement on an interactive surface can be determined from classic image processing techniques, object recognition tends to be complex, computationally expensive and sensitive to environmental conditions. We present a set of techniques in which movement and shape information from the computer vision system is fused with RFID events that identify what objects are in the image. By synchronizing these two complementary sensing modalities, we can associate changes in the image with events in the RFID data, in order to recover position, shape and identification of the objects on the surface, while avoiding complex computer vision processes and exotic RFID solutions.