Reinforcement Learning Group

The Reinforcement Learning group works on theoretical foundations, algorithms, and systems for autonomous decision making. Our main research areas include exploration-exploitation trade-offs, off-policy learning, and generalization for contextual bandits, Markov decision processes, and contextual decision processes. Our techniques have been successfully applied to many domains, including online advertising, recommendations, web search, conversational systems, games, and program synthesis.


Research Team

Current Interns

  • Bo Dai

    PhD candidate

    Georgia Institute of Technology

  • Da Tang

    PhD Intern

    Columbia University

Past Interns



Microsoft Research blog


  • “Deep Reinforcement Learning for Conversational Systems” by Lihong Li
    June 14, 2017 | RLDM 2017