{"id":748969,"date":"2020-01-15T20:27:14","date_gmt":"2020-01-16T04:27:14","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=748969"},"modified":"2021-05-26T20:27:59","modified_gmt":"2021-05-27T03:27:59","slug":"exploring-reinforcement-learning-methods-from-algorithm-to-application","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/exploring-reinforcement-learning-methods-from-algorithm-to-application\/","title":{"rendered":"Exploring Reinforcement Learning Methods from Algorithm to Application"},"content":{"rendered":"<p>Reinforcement learning (RL) is a systematic approach to learning and decision making under uncertainty. RL has been developed and studied for decades, and recent combinations of RL with modern deep learning have led to impressive demonstrations of the capabilities of today&#8217;s RL systems, fueling an explosion of interest and research activity.<\/p>\n<p>In this webinar led by Microsoft researcher Dr. Katja Hofmann, a Principal Researcher in the Game Intelligence group at Microsoft Research Cambridge, learn about the foundations of RL: elegant ideas that give rise to agents that can learn extremely complex behaviors in a wide range of settings. More broadly, gain an overview of what is currently possible in RL from a researcher&#8217;s perspective. 
The webinar concludes with an outlook on key opportunities\u2014both for future research and real-world applications of RL.<\/p>\n<p>Together, you&#8217;ll explore:<\/p>\n<ul>\n<li>Why a Markov Decision Process is a simple yet powerful abstraction for reinforcement learning problems<\/li>\n<li>How to model a task as a reinforcement learning problem<\/li>\n<li>The challenge of balancing exploration and exploitation in reinforcement learning<\/li>\n<li>One of the fundamental approaches to reinforcement learning problems, Q-Learning, and how it solves the credit assignment problem<\/li>\n<li>Q-learning with function approximation<\/li>\n<\/ul>\n<p><strong>Resource list:<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/deep-reinforcement-learning\">Game Intelligence<\/a> (Research group)<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/theme\/reinforcement-learning-group\">Reinforcement Learning<\/a> (Research group)<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-malmo\">Project Malmo<\/a> (Project page)<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/optimistic-actor-critic-avoids-the-pitfalls-of-greedy-exploration-in-reinforcement-learning\">Optimistic Actor Critic avoids the pitfalls of greedy exploration in reinforcement learning<\/a> (Blog)<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/malmo-minecraft-and-machine-learning-with-dr-katja-hofmann\">Malmo, Minecraft and machine learning with Dr. 
Katja Hofmann<\/a> (Podcast)<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/project-malmo-competition-returns-with-student-organizers-and-a-new-mission-to-democratize-reinforcement-learning\">Project Malmo competition returns with student organizers and a new mission: To democratize reinforcement learning<\/a> (Blog)<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/reinforcement-learning-past-present-and-future-perspectives\">Reinforcement Learning: Past, Present, and Future Perspectives<\/a> (Publication)\n<ul>\n<li>Learn about advanced topics in Reinforcement\u202fLearning:\u202f<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/aka.ms\/neurips-2019-rl-tutorial\">aka.ms\/neurips-2019-rl-tutorial<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li>Get started with the Malmo platform:\u202f<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/Microsoft\/malmo\" target=\"_blank\" rel=\"noopener noreferrer\">github.com\/Microsoft\/malmo<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/minerl.io\/competition\/\" target=\"_blank\" rel=\"noopener noreferrer\">Results of the\u202fMineRL\u202fcompetition 2019 @NeurIPS<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/kahofman\">Katja Hofmann<\/a> (Researcher profile)<\/li>\n<\/ul>\n<p>*This on-demand webinar features a previously recorded Q&A session and open captioning.<\/p>\n<p>This webinar originally aired on January 15, 2020.<\/p>\n<p>Explore more Microsoft Research webinars: <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/aka.ms\/msrwebinars\">https:\/\/aka.ms\/msrwebinars<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Reinforcement learning (RL) is a systematic approach to learning and decision making under uncertainty. Developed and studied for decades, recent combinations of RL with modern deep learning have led to impressive demonstrations of the capabilities of today&#8217;s RL systems, and these new combinations have fueled an explosion of interest and research activity. In this webinar [&hellip;]<\/p>\n","protected":false},"featured_media":748972,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[13556],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-748969","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/LsztMquHGDg","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/748969","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/748969\/revisions"}],"predecessor-version":[{"id":748975,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/748969\/revisions\/74
8975"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/748972"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=748969"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=748969"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=748969"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=748969"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=748969"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=748969"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=748969"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=748969"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=748969"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=748969"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}