Microsoft Research Blog

Artificial intelligence

RSLAM: A System for Large-Scale Mapping in Constant-Time Using Stereo

August 31, 2011

Large-scale exploration of the environment requires a constant-time estimation engine. Bundle adjustment or pose relaxation do not fulfil this requirement, as the number of parameters to solve grows with the size of the environment. We describe a relative simultaneous localisation and mapping system…
Geodesic Forests for Image Editing

August 11, 2011 | Toby Sharp, Antonio Criminisi, and Patrick Perez

A Geodesic Forest is a new representation of digital color images which yields flexible and efficient editing algorithms. In this paper an image is decomposed into a collection of trees (a forest) whose branches follow directions of minimum variation. This representation enables expensive, 2D, edge-aware…
KinectFusion: real-time dynamic 3D surface reconstruction and interaction

August 7, 2011

We present KinectFusion, a system that takes live depth data from a moving Kinect camera and in real-time creates high-quality, geometrically accurate, 3D models. Our system allows a user holding a Kinect camera to move quickly within any indoor space, and rapidly scan and create…
Geodesic Image and Video Editing

August 1, 2011 | Antonio Criminisi, Toby Sharp, Carsten Rother, and Patrick Perez

This article presents a new, unified technique to perform general edge-sensitive editing operations on n-dimensional images and videos efficiently. The first contribution of the article is the introduction of a Generalized Geodesic Distance Transform (GGDT), based on soft masks. This provides a unified framework to address…
Optimizing subpixel rendering using a perceptual metric

July 31, 2011

ClearType is a subpixel-rendering method designed to improve the perceived quality of text. The method renders text at subpixel resolution and then applies a one-dimensional symmetric mean-preserving filter to reduce color artifacts. This paper describes a computational method and experimental tests to assess user…
Spatial decision forests for MS lesion segmentation in multi-channel magnetic resonance images

July 15, 2011

A new algorithm is presented for the automatic segmentation of Multiple Sclerosis (MS) lesions in 3D Magnetic Resonance (MR) images. It builds on a discriminative random decision forest framework to provide a voxel-wise probabilistic classification of the volume. The method uses multi-channel MR intensities (T1,…
Entangled decision forests and their application for semantic segmentation of CT images

July 3, 2011

This work addresses the challenging problem of simultaneously segmenting multiple anatomical structures in highly varied CT scans. We propose the entangled decision forest (EDF) as a new discriminative classifier which augments the state of the art decision forest, resulting in higher prediction accuracy and shortened…
Combining generative and discriminative models for semantic segmentation of CT scans via active learning

July 3, 2011

This paper presents a new supervised learning framework for the efficient recognition and segmentation of anatomical structures in 3D computed tomography (CT), with as little training data as possible. Training supervised classifiers to recognize organs within CT scans requires a large number of manually delineated…
PAC-Bayesian learning with asymmetric cost

June 27, 2011 | Ashley J. Llorens and I-Jeng Wang

PAC-Bayes generalization bounds offer a theoretical foundation for learning classifiers with low generalization error and predicting their performance on unseen data. Current formulations implicitly assume that the relative cost of misclassifying a positive or negative example is reflected by the class skew in the training…
Online learning with minority class resampling

May 21, 2011 | Michael J. Pekala and Ashley J. Llorens

This paper considers using online binary classification for target detection where the goal is to identify signals of interest within a sequence of received signals generated by a shifting background. In this setting, we assume there is significant class imbalance (100∶1 or greater), the sequence…
Harvesting Image Databases from the Web

April 1, 2011 | F. Schroff, Antonio Criminisi, and A. Zisserman

The objective of this work is to automatically generate a large number of images for a specified object class. A multimodal approach employing both text, metadata, and visual features is used to gather many high-quality images from the Web. Candidate images are obtained by a…
Robust linear registration of CT images using random regression forests

March 3, 2011

Global linear registration is a necessary first step for many different tasks in medical image analysis. Comparing longitudinal studies [1], cross-modality fusion [2], and many other applications depend heavily on the success of the automatic registration. The robustness and efficiency of this step is crucial as it…
