Portrait of Andrey Kolobov

Andrey Kolobov

Principal Researcher


I am a member of the Adaptive Systems and Interaction group (ASI) at MSR Redmond. My general research interests lie in the theory of decision-making under uncertainty and its applications, ranging from building AI for autonomous sailplane UAVs to designing algorithms for Bing’s next-generation Web crawler.

I graduated with a Ph.D. from the CSE Department of the University of Washington, where I had been advised by Dan Weld and Mausam in June 2013. My thesis focused on mathematical models and scalable domain-independent algorithms for planning under uncertainty.

Before the Ph.D. adventure, I had worked for 2 years at Microsoft’s Desktop Search group, and yet before that received a double B.A. in computer science and applied mathematics at UC Berkeley. While at Berkeley, I also participated in research on a probabilistic first-order logic language called BLOG with Brian Milch and Stuart Russell. Going back to prehistoric times, I finished secondary school #1234 (this is the school’s actual number) in Moscow, Russia.





Approximation Algorithms for Planning Under Uncertainty

Approximation Algorithms for Planning Under Uncertainty, ICAPS’13

Probabilistic Planning with Markov Decision Processes

Probabilistic Planning with Markov Decision Processes, slightly different versions of which were given at the ICAPS’12 Summer School and AAAI’12


Markov Decision Processes (MDPs) are a powerful tool for sequential decision-making under uncertainty, a core subfield of Artificial Intelligence. Their applications include planning advertising campaigns, putting together a scientific agenda for a Mars Rover, playing games, and many others.

This tutorial takes an algorithmic approach to covering MDPs. It starts by describing the basic MDP types: finite, total discounted reward, and stochastic shortest-path MDPs. It then proceeds to techniques for solving them, from the oldest optimal ones (Value Iteration, Policy Iteration), to heuristic search algorithms (LAO*, LRTDP, etc.), to state-of-the-art approximation approaches such as RFF — a special emphasis area of this tutorial. The lecture will conclude by discussing extensions to the established MDP classes and promising research directions.

The tutorial attendees are not expected to have any prior familiarity with MDPs, and will walk away with the knowledge of this field sufficient for conducting research in it.


Approximate Outline

  1. Introduction and a Theory of MDPs
    • Motivation
    • Common MDP classes with guaranteed solution existence
      • Finite-Horizon MDPs
      • Infinite-Horizon discouted-Reward MDPs
      • Stochastic Shortest-Path MDPs
    • SSPs as the general problem class
    • Complexity of solving MDPs
  2. Optimal Solution Algorithms
    • Value Iteration
    • Asynchronous Value Iteration
    • Policy Iteration
    • Modified Policy Iteration
    • Prioritized Value Iteration
      • Prioritized Sweeping
      • Improved Prioritized Sweeping
      • Focused Dynamic Programming
      • Backward Value Iteration
    • Partitioned Value Iteration
      • General Framework
      • Topological Value Iteration
    • LP-based Algorithms
    • Special case of Infinite-Horizon Discounted-Reward MDPs
  3. Heuristic Search Algorithms
    • LAO* and extensions (iLAO*, RLAO*, BLAO*)
    • RTDP and extensions (LRTDP, FRTDP, BRTDP, Bayesian RTDP)
    • Focused Topological VI
    • A note on techniques for computing bounds (lower, upper)
    • Heuristic search and SSP MDPs with dead ends
  4. Approximation algorithms
    • Factored MDPs as the basis for approximation techniques
    • Determinization-based approaches
      • FF-Replan
      • FF-hindsight
      • RFF
      • HMDPP
      • Determinization-based approaches and SSP MDPs with dead ends
    • Sampling-based approaches
      • UCT
    • Heuristic search with inadmissible heuristics
      • The FF Heuristic
      • The GOTH Heuristic
    • Dimensionality Reduction Approaches
      • ReTrASE
      • FPG
      • Approximate Policy Iteration, Approxiate Linear Programming
    • Hierarchical Planning
    • Hybridized Planning
  5. Extensions to MDPs/Research Directions

I have also written Glutton and Gourmand, two solvers for large finite-horizon MDPs with high branching factors.

Français du Canada English