Pushmeet Kohli

Principal Research Manager
Director of Research
Microsoft Research


Pushmeet Kohli is a principal research manager for Microsoft Research. Formerly, he was the technical advisor to Rick Rashid, the Chief Research Officer of Microsoft. He is also an associate of the Psychometrics Centre and Trinity Hall, University of Cambridge.

Pushmeet’s research revolves around Intelligent Systems and Computational Sciences, and he publishes in the fields of Machine Learning, Computer Vision, Information Retrieval, and Game Theory. His current research interests include 3D Reconstruction and Rendering, Probabilistic Programming, and Interpretable and Verifiable Knowledge Representations from Deep Models. He is also interested in Conversational Agents for Task Completion, Machine Learning Systems for Healthcare, and 3D Rendering and Interaction for Augmented and Virtual Reality.

Pushmeet has won a number of awards and prizes for his research. His PhD thesis, titled “Minimizing Dynamic and Higher Order Energy Functions using Graph Cuts”, was the winner of the British Machine Vision Association’s “Sullivan Doctoral Thesis Award”, and was a runner-up for the British Computer Society’s “Distinguished Dissertation Award”. Pushmeet’s papers have appeared in Computer Vision (ICCV, CVPR, ECCV, PAMI, IJCV, CVIU, BMVC, DAGM), Machine Learning, Robotics and AI (NIPS, ICML, AISTATS, AAAI, AAMAS, UAI, ISMAR), Computer Graphics (SIGGRAPH, Eurographics), and HCI (CHI, UIST) conferences and journals. They have won awards at ICVGIP 2006, 2010, ECCV 2010, ISMAR 2011, TVX 2014, CHI 2014, WWW 2014 and CVPR 2015. His research has also been the subject of a number of articles in popular media outlets such as Forbes, Wired, BBC, New Scientist and MIT Technology Review. Pushmeet is a part of the Association for Computing Machinery’s (ACM) Distinguished Speaker Program.


Computer Vision

  • Structured Representations for Visual Knowledge and Commonsense
  • Low-level vision problems: Image Segmentation, Dense Stereo, Optical Flow
  • Object Recognition and Segmentation
  • Human Pose Estimation from KINECT
  • Localization and Reconstruction using KINECT

Machine Learning

  • Verifiable and Interpretable Models
  • Probabilistic Programming
  • MAP Inference in Discrete Models (Discrete Optimization)
  • Structured Learning
  • Learning of Interactive Systems

Game Theory

  • Behavioral game theory research using social networks such as Facebook
  • Finding Optimal Coalitions in Cooperative Games
  • Reconstructing Coalitional Games
  • Computing Optimal Coalition Structures

Information Retrieval

  • Personalizing Search
  • Psychometric profiles for capturing user intent

Curriculum Vitae can be found here.



ALICE: Automated Learning and Intelligence for Causation and Economics

Established: December 5, 2016

Alice is a project to direct Artificial Intelligence towards economic decision making. We are building tools that combine state-of-the-art machine learning with econometrics – the measurement of economic systems – in order to bring automation to economic decision making. The heart of this project is a striving to measure causation: if you want to understand or make policy decisions in a complex economy, you need to…

Neural Program Synthesis

Established: June 15, 2016

In the Cognition group at Microsoft Research, we’re working on developing new neural architectures to automatically learn from specifications such as input-output (I/O) examples. This is useful in automating the development of computer programs that map to a user’s intent—what we call “program synthesis.” The act of programming computing devices is a complex task. Computer scientists have been attempting to solve the problem of program synthesis to automatically create a computer program that is consistent with a…

SemanticPaint: Interactive 3D Labeling and Learning at your Fingertips

Established: June 29, 2015

We present a new interactive approach to 3D scene understanding. Our system, SemanticPaint, allows users to simultaneously scan their environment, whilst interactively segmenting the scene simply by reaching out and touching any desired object or surface. Our system continuously learns from these segmentations, and labels new unseen parts of the environment. Unlike offline systems, where capture, labeling and batch learning often takes hours or even days to perform, our approach is fully online. To be…

Project Malmo

Established: June 1, 2015

How can we develop artificial intelligence that learns to make sense of complex environments? That learns from others, including humans, how to interact with the world? That learns transferable skills throughout its existence, and applies them to solve new, challenging problems? https://youtu.be/KkVj_ddseO8 Project Malmo sets out to address these core research challenges by integrating (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. The Malmo platform is a sophisticated AI experimentation…

Learning to be a depth camera for close-range human capture and interaction

Established: July 14, 2014

We present a machine learning technique for estimating absolute, per-pixel depth using any conventional monocular 2D camera, with minor hardware modifications. Our approach targets close-range human capture and interaction where dense 3D estimation of hands and faces is desired. We use hybrid classification-regression forests to learn how to map from near infrared intensity images to absolute, metric depth in real-time. We demonstrate a variety of human computer interaction scenarios.  

KinectFusion Project Page

Established: August 9, 2011

This project investigates techniques to track the 6DOF position of handheld depth sensing cameras, such as Kinect, as they move through space and perform high quality 3D surface reconstructions for interaction. Other collaborators (missing from the list below): Richard Newcombe (Imperial College London); David Kim (Newcastle University & Microsoft Research); Andy Davison (Imperial College London)    

Human Pose Estimation for Kinect

Established: January 25, 2011

Kinect for Xbox 360 and Windows makes you the controller by fusing 3D imaging hardware with markerless human-motion capture software. Our group investigates such software. Mixing computer vision, graphics, and machine learning techniques, we look at how to build algorithms that can learn to recognize human poses quickly and reliably.

Image Understanding

Established: January 1, 2000

At Microsoft Research in Cambridge we are developing new machine vision algorithms for automatic recognition and segmentation of many different object categories. We are interested in both the supervised and unsupervised scenarios. Research data: Download labelled image databases for supervised learning from the "Downloads" link below. The data provided here may be used freely for research purposes but it cannot be used for commercial purposes. Database of thousands of weakly labelled, high-res images. Pixel-wise labelled…




Visual Storytelling
Ting-Hao (Kenneth) Huang, Francis Ferraro, Nasrin Mostafazadeh, Ishan Misra, Aishwarya Agrawal, Jacob Devlin, Ross Girshick, Xiaodong He, Pushmeet Kohli, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh, Lucy Vanderwende, Michel Galley, Margaret Mitchell, in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), June 13, 2016

Kinect Gesture Data Set

April 2012


Surface Extraction from Binary Volumes with Higher-Order Smoothness

August 2009


Branch-and-Mincut Algorithm for Image Segmentation

April 2009


Unwrap Mosaic—Embedding

August 2008



Press Coverage

Reconstructing and Depth Sensing from Standard Smartphones

Probabilistic Programming for Visual Perception

Semantic Understanding of 3D Spaces

Behavioral Game Theory

Kinect Fusion – Generating 3D Reconstructions

Unwrap Mosaics (Next-Gen Video Editing)


  • July 2016, Paper on continuous relaxation based approaches for efficient inference in Dense CRFs will appear in ECCV 2016.
  • June 2016, Our paper describing the role that time spent by crowdworkers can play when aggregating responses in crowdsourcing systems will appear in the Journal of Artificial Intelligence Research (JAIR).
  • May 2016, New work on Hand pose estimation will appear in SIGGRAPH 2016.
  • May 2016, Our paper on real time full body reconstruction (Fusion 4D) will appear in SIGGRAPH 2016.
  • March 2016, We have developed a super efficient algorithm (Global Patch Collider) for correspondence estimation that will appear in CVPR 2016.
  • March 2016, Paper on Layered Scene Decomposition via the Occlusion-CRF will be presented at CVPR 2016.
  • February 2016, Our new Visual storytelling dataset and its accompanying paper will be released at NAACL 2016.
  • February 2016, Cloze Test – our proposal for evaluating the ability to capture narrative structure – to be released in our NAACL 2016 paper.
  • January 2016, Moving to the Microsoft headquarters in Redmond to start a new AI team.
  • December 2015, Paper on Non-greedy training of decision trees appears in NIPS 2015.
  • December 2015, Paper on interpretable autoencoders – Deep Convolutional Inverse Graphics Network is presented at NIPS 2015.
  • September 2015, Our paper on Non-greedy training of decision trees will appear in NIPS 2015.
  • September 2015, Our paper on interpretable autoencoders – Deep Convolutional Inverse Graphics Network will appear in NIPS 2015.
  • August 2015, MobileFusion has received a lot of press coverage (see articles in WIRED, Mashable, and even The Register).
  • August 2015, Our paper Mobile Fusion that describes a method for generating 3D reconstruction of objects using a standard smartphone has been accepted to appear in ISMAR 2015.
  • July 2015, Our paper on Maximum Flows by Incremental Breadth-First Search (with Andrew Goldberg, Sagi Hed, Robert Tarjan, Renato Werneck and Haim Kaplan) will appear at the European Symposium on Algorithms (ESA)
  • July 2015, Interactive Scene Understanding (Semantic Paint) is covered widely in the Press (BBC, Engadget, and even the Daily Mail).
  • July 2015, Our paper on learning to decipher heaps for software verification wins the Best Paper Award at the Constructive ML workshop at ICML 2015.
  • June 2015, Our paper on “Interactive Scene Understanding”, a collaboration with the University of Oxford and Stanford, will be presented at SIGGRAPH 2015. See video here.
  • June 2015, The “Picture Probabilistic Programming” paper receives the Best Paper Honorable Mention Award at CVPR 2015.
  • May 2015, Paper on learning Perturbation Models with Multidimensional Parametric Min-cuts has been accepted to appear in UAI 2015.
  • May 2015, Paper on Information Gathering in Networks via Active Exploration has been accepted to appear in IJCAI 2015.
  • May 2015, Video of the Hand Pose Estimation work is now available here.
  • April 2015, Picture, a new programming language for vision problems, has been accepted to appear in CVPR 2015 and has received a lot of press coverage (MIT News, The Register, Scientific Computing, iProgrammer).
  • April 2015, Paper on Real-time Hand Pose Estimation is appearing in CHI 2015.
  • March 2015, Our paper on fast hashing has been accepted to appear in CVPR 2015.
  • February 2015, Paper on Crowdsourcing Language Understanding in the Wild is accepted to appear in WWW 2015
  • January 2015, Paper on Consensus Message Passing is accepted to appear in AISTATS 2015
  • December 2014, Joined the Advisory board for the NEMOG (New Economic Models for Digital Games) project.
  • November 2014, Paper on Learning with Multiple Annotation-specific Loss Functions is accepted to appear in EMMCVPR 2015.
  • October 2014, Paper and Video describing our work on automatic layout of virtual objects for augmented reality released at ISMAR 2014.
  • October 2014, Paper on Just-in-Time Inference is accepted to appear in NIPS 2014.
  • September 2014, Paper on Real-time Face Reconstruction from a Single Depth Image is accepted to appear in 3DV 2014.
  • August 2014, Our paper on Exploration of Group Viewing Patterns is the Runner-up for the Best Paper Award at TVX 2014.
  • July 2014, 3 papers (contour completion, learning with perceptual loss functions, Non-parametric Higher-order MRFs) are accepted to appear in ECCV 2014.
  • June 2014, Our demo on FilterForest for Image Denoising is the Runner-up for the Best Demo Award at CVPR 2014.
  • May 2014, Paper on automatic layout of virtual objects for augmented reality is accepted to appear at ISMAR 2014.
  • April 2014, Paper on community priors for crowdsourcing is the runner-up for the best paper award at WWW 2014.
  • March 2014, Paper on Depth from IR illumination fall-off is accepted to appear in SIGGRAPH 2014.
  • February 2014, 3 Papers (Learning Portfolios for camera relocalization, Personalized gesture recognition, Filter Forests for Image Labelling) are accepted to appear in CVPR (2 for oral presentation).
  • January 2014, Paper on encouraging diversity in multiple-output prediction is accepted for oral presentation at AISTATS 2014.
  • January 2014, Paper on community priors for crowdsourcing is accepted to appear in WWW 2014.
  • December 2013, Our paper on the effect of principles on power of agents in strategic games is accepted to appear in AAMAS 2014.
  • December 2013, Paper on User Behaviour Adaptation Under Interface Change accepted to appear in IUI 2014.
  • November 2013, Our Infer.Net based model for crowdsourcing won the CrowdScale challenge at HCOMP 2013.
  • October 2013, Gave a talk at the Human Behaviour Understanding workshop at ACM Multimedia.
  • September 2013, Paper on Decision DAGs accepted to appear in NIPS 2013.
  • September 2013, Invited to join the editorial board of CVIU.
  • August 2013, Paper on Semantic labelling of Voxel Spaces accepted to appear in ICCV 2013.
  • July 2013, Paper on text detection and recognition accepted to appear at ICDAR 2013.
  • June 2013, Tutorial on Solving real world problems with RGBD sensors at CVPR 2013.
  • June 2013, Gave a talk at the London School of Economics (LSE) on the value of Big Data.
  • April 2013, Papers on online algorithms for diverse recommendations and computation of coalition structures in coalition games are accepted to appear in AAAI 2013.
  • March 2013, Four papers (two for oral presentation) accepted to appear in CVPR 2013.
  • February 2013, Invited to join the editorial board of IJCV.
  • January 2013, Paper on faster training of structural SVMs is accepted to appear in AISTATS 2013 (here).
  • October 2012, PAMI paper describing the pose estimation system for KINECT has been accepted.
  • September 2012, Two papers (1) structured output prediction with multiple choices (2) context driven random forests are accepted to appear in NIPS 2012.
  • August 2012, The accepted ECCV and DAGM papers are now available online.
  • July 2012, Jamie Shotton and I have written a new book chapter on our recent work on Human Pose Estimation for the KINECT.
  • June 2012, 5 Papers accepted to appear in the European Conference on Computer Vision (ECCV).
  • May 2012, The MSR KINECT gesture dataset collected for our CHI 2012 paper is now available.
  • April 2012, 3 Papers on Behavioural Game Theory and Personality-Online Behaviour patterns are accepted to appear in ACM WebSci 2012.
  • March 2012, Papers accepted to CHI 2012, AISTATS 2012, Eurographics 2012 and PAMI/IJCV are now available online.
  • February 2012, New Facebook game Doubloon Dash designed to study strategies used in all-pay auctions is now available on Facebook.
  • January 2012, Carsten Rother and I are teaching the Advanced Computer Vision course in the Engineering department of the University of Cambridge.
  • December 2011, Videos describing some of my research can be found here.
  • November 2011, Selected to join ACM’s Distinguished Speaker Program.
  • October 2011, Our paper on simultaneous localization and 3D mapping wins the Best Paper Award at ISMAR 2011.
  • October 2011, Interviews on the role of Behavioural game theory appear in Forbes and The Economic Times.
  • September 2011, Our first Facebook app (MSR Project Waterloo) for conducting behavioural game theory experiments is now online!
  • September 2011, KinectFusion gets reviewed by MIT Technology Review (full article).
  • September 2011, The KinectFusion system has been made public. (Project Page) (video) (See UIST and ISMAR papers for detail)
  • September 2011, The ICCV papers on Decision Tree Fields and Regression for Human Pose Estimation are now online.
  • June 2011, Our book on Advances in Markov Random Fields for Vision and Image Processing has been published by MIT Press.
  • June 2011, Slides for the invited tutorial at IBPRIA are now online! (See Invited Tutorial on MAP Inference in Discrete Models)
  • June 2011, 4 papers on improving speed and accuracy of conventional algorithms for MAP inference by making them energy or problem-aware have appeared in AISTATS, ICML and CVPR 2010.
  • December 2010, Our paper on evaluation and learning of interactive Segmentation systems wins the Best Paper Award at ICVGIP 2010.
  • October 2010, Our paper on inference with co-occurrence potential wins the Best Paper Award at ECCV 2010.


Professional Duties

Academic Duties

Program Committee Member – Reviewing Duties

  • Journal of Machine Learning Research (JMLR)
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  • International Journal of Computer Vision (IJCV)
  • Computer Vision and Image Understanding (CVIU)
  • Neural Information Processing Systems (NIPS)
  • IEEE International Conference on Computer Vision (ICCV)
  • IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)
  • European Conference on Computer Vision (ECCV)
  • International Conference on Artificial Intelligence and Statistics (AISTATS)
  • British Machine Vision Conference (BMVC)


Talks, Courses, & Tutorials

  • NIPS 2009 Workshop Talk on Learning and Evaluating Interactive Segmentation Systems
    Whistler, December 2009. [slides]
  • ICCV 2009 Tutorial on MAP Inference in Discrete Models
    Kyoto, September 2009. [slides]


Dear collaborator, please send me a mail if your name is missing from this list and you would want me to add it here.

Past and Current Students and Interns

  • Lubor Ladicky, PhD student 2007-2011, (now Post-doc at Oxford)
  • Dhruv Batra, Intern 2010 (now Asst. Professor at TTI Chicago)
  • Michal Kosinski, Intern 2010 (now at Psychometrics Centre, Cambridge)
  • Patrick Pletscher, Intern 2010 (PhD student at ETH)
  • Bangpeng Yao, Intern 2010 (PhD student at Stanford)
  • Olga Barinova, Intern 2009 (now at Moscow State University)
  • Hannes Nickisch, Intern 2009 (now at Philips Research)
  • Sara Vicente, Intern 2008
  • Dheeraj Singaraju, Intern 2008
  • Kyomin Jung, Intern 2008 (now Asst. Professor at KAIST)

Other Collaborators

  • Shahram Izadi, MSR Cambridge
  • Yoram Bachrach, MSR Cambridge
  • Thore Graepel, MSR Cambridge
  • Jamie Shotton, MSR Cambridge
  • Sebastian Nowozin, MSR Cambridge
  • Carsten Rother, MSR Cambridge
  • Andrew Fitzgibbon, MSR Cambridge
  • Otmar Hilliges, MSR Cambridge
  • Philip Torr, Oxford Brookes University
  • M Pawan Kumar, Ecole Centrale Paris
  • Richard Newcombe, Imperial
  • Chang Yoo, KAIST