
Microsoft Research Lab – Asia

Matching points between objects of different shapes and styles

October 25, 2017

Miran Lee, Outreach Director, Microsoft Research Asia

Many computer vision applications require points on an object in one image to be matched to their corresponding points on an object in another image, such as the door handle of one car matched to the door handle of a different car model. Dealing with such appearance variations across different objects is useful for object segmentation, image editing and other applications. This matching problem is also the central element of most vision-based shape reconstruction techniques, which find the 3D position of each scene point based on where it appears in images taken from different camera locations.

When the objects in the images are not quite the same, such as different models of cars, matching can become especially difficult. In the door handle example, corresponding points on the handles of two different cars are challenging to estimate, because the handles may differ in visual properties such as color, shape and orientation, which confounds computer vision algorithms. Geometric variations, in particular, are an obstacle that has lacked a satisfactory solution in matching algorithms.

Recently, researchers at Microsoft Research Asia and Yonsei University developed a solution to this problem based on deep neural networks. Collaborating with Dr. Steve Lin, a principal researcher at Microsoft Research Asia, Yonsei PhD student Seungryong Kim and his advisor Professor Kwanghoon Sohn have devised a system that handles geometric variations in the form of affine transformations through discrete-continuous transformation matching (DCTM). The continuous space of possible affine transformations is too large to search exhaustively for a geometric transformation that aligns corresponding regions of two objects. The researchers address this issue by solving for a match via optimization over a limited set of discrete transformation candidates, and then iteratively updating that discrete candidate set through continuous regularization based on the affine transformations estimated over the entire object. In this way, the technique draws solutions from the continuous space of affine transformations while remaining efficient to compute.

The result of applying DCTM to compute matches between two cars is shown in the figure, where matched points are color-coded. Although the two cars differ in color, geometry and features, the system accurately identifies matching points on each car. With this matching information, the source car can be transformed into the shape of the target car.

This research on discrete-continuous transformation matching is being presented at the 2017 International Conference on Computer Vision (ICCV) in Venice, Italy, October 22-29. As an oral presentation paper, this work is among the top 3 percent of submissions to this conference.