The need

Correctly capturing an object's position, orientation and identity is a major challenge. Without prior information, stereo optics or physical measurements, it is hard to estimate scale or distance, and object recognition requires a large labelled dataset.

The idea

Convolutional neural networks (CNNs) have made significant strides in object recognition, classification and segmentation, as used in self-driving vehicles, for example. PoseTracker leverages the power of CNNs to recognise and track objects in 3D.

The solution

PoseTracker uses a patented optical marker to infer an object's pose from 2D images, then tracks that pose from one image to all subsequent images by comparing it to a predefined 3D orientation.

Technical details for PoseTracker

Convolutional neural networks, a class of deep neural network, have made significant strides in recent years in object recognition, classification and segmentation, driving major advances in self-driving vehicles and a wide variety of computer vision applications.

However, there have been very few practical implementations of these advanced approaches in 3D object pose estimation. The ability to recognise and track an object in 3D reference space remains a difficult problem due to several challenging issues:

  1. The 3D pose information is hard to capture, requiring complicated set-ups involving stereo optical or magnetic localisation apparatus.
  2. There is often no prior information about the object of interest.
  3. A labelled dataset with the proper pose information is very hard to obtain in large quantities, and traditional image manipulations such as axis scaling and geometric transformations inevitably corrupt the 3D pose information.

The idea is to leverage the power of CNNs and build an application that recognises and tracks the pose (position and orientation) of objects in 3D, using a patented optical marker that helps identify the object's rotation and estimate its pose.

PoseTracker is a proof of concept for a simple object pose detection pipeline that integrates rotation information from a 3D pose tracking aid (an optical marker).

The application analyses 2D images taken from a camera, with the optical marker always visible. Using supervised training, it detects the marker and infers the object's orientation from one image to all subsequent images by comparison to a predefined 3D orientation.
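The geometric core of such a step can be sketched with standard computer vision tools. The example below is only a minimal illustration, not PoseTracker's patented marker or its actual code: it assumes the marker's four corners have already been located in the image (for example by the supervised CNN detector) and uses OpenCV's perspective-n-point solver to recover the pose and compare it to a reference orientation. The marker size, camera parameters and function names are assumptions for illustration.

```python
# Illustrative sketch only: assumes marker corners are already detected
# (e.g. by a CNN detector); marker size and camera intrinsics are made up.
import numpy as np
import cv2

MARKER_SIZE = 0.05  # assumed marker edge length in metres

# 3D coordinates of the square marker's corners in its own reference frame,
# ordered to match the detector's 2D corner output.
OBJECT_POINTS = np.array([
    [-MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
    [-MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
], dtype=np.float32)


def estimate_pose(corners_2d, camera_matrix, dist_coeffs):
    """Recover the marker's rotation matrix and translation vector from its
    detected 2D corners with a perspective-n-point solve."""
    ok, rvec, tvec = cv2.solvePnP(
        OBJECT_POINTS, corners_2d.astype(np.float32),
        camera_matrix, dist_coeffs)
    if not ok:
        return None, None
    rotation, _ = cv2.Rodrigues(rvec)  # axis-angle -> 3x3 rotation matrix
    return rotation, tvec


def orientation_relative_to_reference(rotation, reference_rotation):
    """Express the current orientation relative to a predefined reference
    orientation and return the angle (in degrees) between them."""
    relative = reference_rotation.T @ rotation
    cos_angle = np.clip((np.trace(relative) - 1.0) / 2.0, -1.0, 1.0)
    return relative, np.degrees(np.arccos(cos_angle))
```

In this sketch the translation vector also gives the object's distance from the camera (np.linalg.norm(tvec)), which is how a single handheld camera could report angle, orientation and distance in real time.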

In future, this different approach to the pose tracking problem could let you use your phone camera to get the angle, orientation and distance of an object from you in real time.

